History

Mathematical gnostics is an alternative (non-statistical) approach based on a new paradigm of quantitative uncertainty. It originated at the borders of several scientific fields, inspired by ideas and methods from mathematics and geometry, measurement theory, thermodynamics, mechanics and statistics. Its development was connected with the professional activity of its author, Pavel Kovanic, as reflected in his bibliography. Educated in high-voltage electrical engineering and retrained in nuclear technology, he was attracted by problems of nuclear reactor control and of statistical data treatment in nuclear research. The politically motivated end of his career in nuclear engineering, however, brought the opportunity to work among specialists in statistics and cybernetics at the Institute of Information Theory and Automation of the Czechoslovak Academy of Sciences, Prague.

Experience with multidimensional statistical models resulted in the Minimum-Penalty Estimate, which made it possible to optimize the compromise between unbiased and biased minimum-variance estimation. However, it also led to the understanding that information is lost unnecessarily when all data, “bad” and “good” alike, are given the same weight determined by the “collective” variance of the data sample.

An individual weight determined by the estimated error of each datum, as used in robust statistics, could improve the estimate. However, this idea produced a large number of “influence” functions, each valid under specific assumptions about the data model and not working elsewhere. Robustness with respect to outliers improved, but a different non-robustness arose from the dependence on subjective assumptions about the nature of the data.
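The difference between one “collective” weight and individual weights can be illustrated with a minimal sketch. It uses the Huber weight function, which is just one of the many influence-function choices from robust statistics and is not a gnostic method; the function names, the tuning constant `k = 1.345`, the MAD-based scale and the sample values are all assumptions made for this illustration.

```python
import statistics

def huber_weights(residuals, scale, k=1.345):
    """Huber weights: 1 for small residuals, k*scale/|r| for large ones."""
    weights = []
    for r in residuals:
        a = abs(r)
        weights.append(1.0 if a <= k * scale else k * scale / a)
    return weights

def huber_location(data, k=1.345, iterations=50, tol=1e-9):
    """Location estimate by iteratively reweighted averaging (Huber weights)."""
    mu = statistics.median(data)
    # Median absolute deviation, rescaled to match the standard
    # deviation for normally distributed data (an assumed scale choice).
    mad = statistics.median(abs(x - mu) for x in data)
    scale = 1.4826 * mad if mad > 0 else 1.0
    for _ in range(iterations):
        w = huber_weights([x - mu for x in data], scale, k)
        new_mu = sum(wi * xi for wi, xi in zip(w, data)) / sum(w)
        converged = abs(new_mu - mu) < tol
        mu = new_mu
        if converged:
            break
    return mu

# Hypothetical sample: five consistent readings and one outlier.
sample = [9.8, 10.1, 10.0, 9.9, 10.2, 25.0]
plain_mean = sum(sample) / len(sample)   # every datum has the same weight
robust_mean = huber_location(sample)     # the outlier is down-weighted
print(plain_mean, robust_mean)
```

The result depends on the chosen weight function, tuning constant and scale estimate, which is exactly the kind of subjective assumption the paragraph above refers to.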

There was also a fundamental problem: non-linear weighting of data is equivalent to introducing a Riemannian metric in place of the Euclidean one that lies at the foundation of statistics. According to B. Riemann, however, determining the metric of a real curved space should not be a task for mathematicians: “Metrics are given objectively by laws of Nature”. This idea was confirmed, for example, by the Minkowskian metric of the special theory of relativity, determined by the finite speed of light, and by the metric of the cosmos, determined by gravitational fields in Einstein’s theory of gravitation.

Another problem of the statistical approach was that many proofs of statistical statements rely on the Central Limit Theorem, whose validity is limited to “large” randomly selected data samples drawn from a distribution with a finite mean and standard deviation. Many applications do not support such a data model. It was evident that a law of Nature applicable more universally, even to individual uncertain data and to small data samples, had to be found to justify the measurement and composition of uncertain data.

This motivation led to the gnostic theory of individual uncertain data and small samples. The metric of the space of uncertain data has been shown to result from structural features of properly quantified real uncertain data. Extremals of this space describe the nature of data uncertainty and make it possible to determine the optimum estimation path, the one that minimizes the data uncertainty. The entropy increase and information loss caused by the uncertainty can be derived by using the classical (non-statistical, Clausius’s) entropy for an individual uncertain data item. The probability distribution of such a data item is the final result of proving the equation of the mutual conversion of entropy and information (recalling the idea of “Maxwell’s demon”). The fundamental characteristics of an uncertain data item (the irrelevance, i.e. the “data error”, and its integral, the data weight) are shown to be isomorphic with the momentum and energy of a free relativistic particle. This mapping is Lorentz-invariant, i.e. valid for all amounts of uncertainty (and for all corresponding velocities of the particle). The Lorentz-invariant (“quantifying”) uncertainty characteristics, the irrelevance and the data weight, thus have their estimating counterparts. Estimating characteristics differ from quantifying ones in their natural robustness: estimation is robust with respect to outliers, while quantification is robust with respect to inliers (it increases the weights of peripheral data).

The mapping between uncertainty and mechanics implies that, in the quantification process, irrelevances compose additively and that the same law holds for data weights. This means that the composition law for uncertain data is justified by the Energy-Momentum Conservation Law of relativistic mechanics.
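For orientation, the mechanical side of this mapping can be written out with standard special relativity. The first two lines are textbook relations for free particles; the last line is only an assumed reading of the correspondence described above, and the symbols $h_i$ (irrelevance), $f_i$ (data weight) and $\phi_i$ (rapidity) are introduced here for illustration and do not come from the text.

```latex
% Free particle of rest mass m and rapidity \phi (velocity v = c \tanh\phi):
E = m c^{2} \cosh\phi , \qquad p\,c = m c^{2} \sinh\phi , \qquad
E^{2} - (p\,c)^{2} = (m c^{2})^{2} .

% Energy-momentum conservation for a system of free particles:
E_{\mathrm{tot}} = \sum_{i} E_{i} , \qquad p_{\mathrm{tot}} = \sum_{i} p_{i} .

% Assumed correspondence (illustrative only): irrelevance <-> momentum,
% data weight <-> energy, so that both compose additively:
h_{i} \leftrightarrow \sinh\phi_{i} , \qquad f_{i} \leftrightarrow \cosh\phi_{i} , \qquad
h_{\mathrm{tot}} = \sum_{i} h_{i} , \qquad f_{\mathrm{tot}} = \sum_{i} f_{i} .
```

Under this reading the weight is the integral of the irrelevance with respect to $\phi$, since $\int \sinh\phi \, d\phi = \cosh\phi$ up to a constant, which agrees with the description of the data weight as the integral of the irrelevance.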

It is well known from the history of science that promoting a new paradigm has always been a difficult process. It is no wonder that a paradigm which attaches entropy, information and probability to a single data item, justifies their non-linear measurement by non-Euclidean geometries and supports its statements by thermodynamics and relativistic mechanics was not received favourably in a scientific environment full of statisticians. Continuing this type of research in that environment was possible only thanks to the support of a few colleagues whose minds were open enough to cross the boundary of the statistical paradigm. Further support came from industry, where the new methodology, in the form of gnostic software, was proving successful.

The development of gnostic functions proceeded in parallel with the progress of computing technology from small programmable calculators to modern PCs, because large computers running in batch mode did not provide the fast feedback required by the intricate functions.

Two modern mathematical and statistical computing environments deserve to be mentioned in this connection:

  1. S-PLUS™ (www.insightful.com), whose S language was used over the long term to develop a broad range of gnostic functions. This enabled many tests and applications.
  2. The environment of the R-project (www.r-project.org). R is a language and environment for statistical computing and graphics. It is a GNU project which is similar to the S language and environment. R is available as Free Software under the terms of the Free Software Foundation’s GNU General Public License in source code form.

Starting in 2000, Health Risk Assessment and the monitoring of environmental pollution became the prevailing focus of applications of mathematical gnostics. Cooperation with the Institute of Public Health, Ostrava (http://www.zuova.cz/) turned out to be fruitful for both the development and the application of mathematical gnostics within the framework of three research projects of the European Union:

  1. MAGIC: Management of Groundwater at Industrially Contaminated Areas (www.magic-cadses.com),
  2. 2-FUN: Full-chain and UNcertainty Approaches for Assessing Health Risks in Future ENvironmental Scenarios (2-fun.org),
  3. FOKS: Focus on Key Sources of Environmental Risks (projectfoks.eu).

© 2017–2018,  Pavel Kovanic and Zdeněk Wagner