[Date Prev][Date Next] [Chronological] [Thread] [Top]

Re: [mizar] Size of Mizar/MML



Hello,

I am not sure what you mean by "autogenerated theorems", I do not think Mizar has such a feature.

Two bits of advice spring to mind:

Advice 1: Use the abstract files. If you use the latest MML, the $MIZFILES/abstr/ directory will number the theorems and definitions automatically for you. After thinking about it for a bit, I think this is "the right way to go". These are the local files corresponding to what's available online (https://mizar.uwb.edu.pl/version/current/html/).

You can simply "grep -r theorem $MIZFILES/abstr/ | wc -l" to get the number of theorems (for me, 75571) and  "grep -r ":: .+:sch " $MIZFILES/abstr/ | wc -l" counts the number of schemes (for me, 914). Note that this will not count "lemmas" which are local to files, but "egrep -nr "Lm[0-9]+:" $MIZFILES/mml/ | wc -l" suggests there are 7345 lemma; I would not count them, personally, because they are often made into theorems later in the article (or they prove half of an "iff" statement, or...).

Definitions are a bit trickier, since multiple definitions may appear within a single "definition block". One crude estimate may be give by looking at the number of times ":: ARTICLE:def " appears (running "grep -nr ":: .+:def "  $MIZFILES  abstr/ | wc -l" produces 14134). A similar grep may be done for synonyms (495) and antonyms (138), which are "abbreviations" of sorts.

These numbers are for mizar-8.1.15_5.94.1493 which is the latest version of the MML. 

I should stress, these numbers are approximate. In particular, the theorem count will also include comments like ":: Binomial theorem" (the command "egrep -nr "::.+theorem" $MIZFILES/abstr/ | wc -l" tells me there are only 432 instances of this, which means there are 75139 theorems in the MML).

I am certain there is a cleverer way to do a more accurate count of these things.

Advice 2: About axioms. Also, the articles (files) TARSKI_0 and TARSKI_A contain the axioms to the underlying set theory, so presumably you'd want to exclude them? These are merely 6 axioms and 1 scheme.

Best,
Alex

On Thu, Jun 5, 2025 at 8:56 AM Fabian Huch huch _AT_ in.tum.de <owner-mizar-forum@mizar.uwb.edu.pl> wrote:
Dear Mizar Community,

I am currently creating a comparison of large libraries of ITPs. I am
only considering libraries with more than 1M (nonempty) LOC, of which I
know (in alphabetical order): ACL2, HOL4, Isabelle, Lean, Metamath,
Mizar, PVS, Rocq.
To that end, I want to approximate the number of definitions (incl.
axioms / excl. abbreviations, instances, fixed values/constants,
examples) and proven proper theorems (excl. automatically generated
ones) by simple analysis of the sources, without running the system.

For Mizar/MML, my approach would be to look at tokens in .miz files and
count:

For definitions: "definition"

For theorems: "theorem" and "scheme"


Does this sound reasonable? I am missing something? Any hints are
appreciated.


Best,

Fabian Huch