[mizar] copy/paste detection in MML
Hi,
at http://lipa.ms.mff.cuni.cz/~urban/mmlcpd/mmlcpd.4.87.985/cpd/ are
results of running the CPD (Copy/paste Detector,
http://pmd.sourceforge.net/cpd.html) on each MML article. About four
thousand copied blocks were detected; use
http://lipa.ms.mff.cuni.cz/~urban/mmlcpd/mmlcpd.4.87.985/cpd/?C=S;O=D to
sort the articles by their amount of copying.
I'll probably also add info about inter-article copying later. The
detection could also be improved by writing a special Mizar parser for
CPD, and by using normalized versions of articles (e.g. with normalized
identifier names, which could be done by simple postprocessing of the XML
representation).
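
To illustrate why normalized identifier names should help, here is a minimal
Python sketch. It is not CPD's algorithm and does not use the Mizar XML
representation: the tokenizer, the tiny KEYWORDS set, and the function names
are all assumptions made up for this example. It renames identifiers to
ID0, ID1, ... inside each window of k tokens and reports windows that become
identical after renaming.

```python
import re
from collections import defaultdict

# Hypothetical, deliberately tiny keyword list for illustration only;
# the real set of Mizar reserved words is much larger.
KEYWORDS = {"let", "be", "assume", "thus", "hence", "then", "by", "for",
            "st", "holds", "proof", "end", "take", "consider", "such",
            "that", "set", "in"}

# Crude tokenizer: identifiers, numbers, or any single non-space character.
TOKEN_RE = re.compile(r"[A-Za-z_][A-Za-z0-9_']*|\d+|\S")


def tokenize(text):
    return TOKEN_RE.findall(text)


def normalize(tokens):
    """Rename each non-keyword identifier to ID0, ID1, ... by first appearance."""
    names = {}
    out = []
    for tok in tokens:
        if (tok[0].isalpha() or tok[0] == "_") and tok not in KEYWORDS:
            out.append(names.setdefault(tok, f"ID{len(names)}"))
        else:
            out.append(tok)
    return out


def duplicate_windows(text, k):
    """Map each normalized k-token window seen more than once to its start positions."""
    tokens = tokenize(text)
    seen = defaultdict(list)
    for i in range(len(tokens) - k + 1):
        key = tuple(normalize(tokens[i:i + k]))
        seen[key].append(i)
    return {key: pos for key, pos in seen.items() if len(pos) > 1}


if __name__ == "__main__":
    sample = "let x be set; assume x in A; let y be set; assume y in B;"
    for window, positions in duplicate_windows(sample, 10).items():
        print(positions, " ".join(window))
```

The two statements in the sample differ only in the names x/y and A/B, so a
purely textual comparison would miss them, while the normalized windows
coincide.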
I hope this info will be used to gradually get rid of the worst cases of
copying. I also suggest using tools like CPD as a part of the reviewing
process for MML articles.
Best,
Josef Urban