**If A represents the prior belief and B represents the new evidence, then:**

*P(A)* is known as the **prior probability**. A prior can be **informative**, meaning we hold a strong prior belief, or **uninformative**, meaning we are much less certain about the parameter's true value.

*P(B|A)* is known as the **likelihood function**: the probability of the new data or evidence given that A is true. It quantifies how well the evidence agrees with our prior belief.

*P(A|B)* is known as the **posterior probability**: the probability of A after taking the evidence into account.
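These three quantities are tied together by Bayes' theorem, P(A|B) = P(B|A) P(A) / P(B). A minimal sketch for a binary hypothesis A follows. The function name and the numbers are hypothetical, chosen only for illustration, and P(B) is expanded with the law of total probability.

```python
def posterior(prior_a: float, p_b_given_a: float, p_b_given_not_a: float) -> float:
    """Return P(A|B) via Bayes' theorem for a binary hypothesis A."""
    # Evidence P(B) = P(B|A) P(A) + P(B|not A) P(not A).
    evidence = p_b_given_a * prior_a + p_b_given_not_a * (1.0 - prior_a)
    return p_b_given_a * prior_a / evidence


# Hypothetical numbers: a 1% prior, and evidence that is 95% likely
# when A is true but only 5% likely when A is false.
p = posterior(prior_a=0.01, p_b_given_a=0.95, p_b_given_not_a=0.05)
print(round(p, 4))  # prints 0.161
```

Even strong evidence moves a 1% prior to only about 16%, which shows how much an informative prior can weigh on the posterior.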

**Boltzmann machine**: Each undirected edge represents a dependency. In this example there are 3 hidden units and 4 visible units. This is a restricted Boltzmann machine (RBM). RBMs are probabilistic: rather than assigning discrete values, the model assigns probabilities. At each point in time the RBM is in a certain state, which refers to the values of the neurons in the visible and hidden layers, v and h. This is the point where restricted Boltzmann machines meet physics for the second time.

The **joint distribution** is known in physics as the Boltzmann distribution, which gives the probability that a particle is observed in a state with energy E.

As in physics, we assign to each state of v and h a probability that depends on the overall energy of the model. Unfortunately, the joint probability is very difficult to calculate because of the huge number of possible combinations of v and h in the partition function Z. The conditional probabilities of state h given state v, and of state v given state h, are much easier to calculate. The essential point here is that the probability is energy-based. Reconstruction differs from regression or classification in that it estimates the probability distribution of the original input instead of associating a continuous or discrete value with an input example.
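To make the contrast concrete, here is a small sketch for an RBM with 4 visible and 3 hidden binary units. It assumes the common energy form E(v, h) = -b·v - c·h - vᵀWh. The parameter names `W`, `b`, `c` and their random values are assumptions for illustration, not taken from the text. At this size the partition function can still be enumerated (128 states), so the closed-form conditional can be checked against brute force.

```python
import itertools
import math
import random

N_VISIBLE, N_HIDDEN = 4, 3  # sizes taken from the example in the text

rng = random.Random(0)
# Hypothetical parameters: weights W[i][j], visible biases b, hidden biases c.
W = [[rng.gauss(0.0, 0.5) for _ in range(N_HIDDEN)] for _ in range(N_VISIBLE)]
b = [rng.gauss(0.0, 0.5) for _ in range(N_VISIBLE)]
c = [rng.gauss(0.0, 0.5) for _ in range(N_HIDDEN)]


def energy(v, h):
    """E(v, h) = -sum_i b_i v_i - sum_j c_j h_j - sum_ij v_i W_ij h_j."""
    e = -sum(b[i] * v[i] for i in range(N_VISIBLE))
    e -= sum(c[j] * h[j] for j in range(N_HIDDEN))
    e -= sum(v[i] * W[i][j] * h[j]
             for i in range(N_VISIBLE) for j in range(N_HIDDEN))
    return e


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


visible_states = list(itertools.product([0, 1], repeat=N_VISIBLE))
hidden_states = list(itertools.product([0, 1], repeat=N_HIDDEN))

# Partition function: a sum over all 2^(4+3) = 128 joint states.
# The number of terms doubles with every added unit.
Z = sum(math.exp(-energy(v, h)) for v in visible_states for h in hidden_states)


def joint(v, h):
    """Boltzmann joint probability p(v, h) = exp(-E(v, h)) / Z."""
    return math.exp(-energy(v, h)) / Z


def p_hidden_on(j, v):
    """Closed-form conditional p(h_j = 1 | v); no partition function needed."""
    return sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(N_VISIBLE)))


v = (1, 0, 1, 1)
p_v = sum(joint(v, h) for h in hidden_states)
brute_force = sum(joint(v, h) for h in hidden_states if h[0] == 1) / p_v
print(abs(brute_force - p_hidden_on(0, v)) < 1e-9)
```

The conditional costs one dot product per hidden unit, while the joint needs the full sum over states, which is why the conditionals are the practical route for larger models.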

Boltzmann Machine as a Pontryagin Observer in Sensor Network

Bayesian networks are directed acyclic graphs (DAGs) whose nodes represent variables in the Bayesian sense: they may be observable quantities, latent variables, unknown parameters or hypotheses. Edges represent conditional dependencies; nodes that are not connected (no path connects one node to another) represent variables that are conditionally independent of each other. Each node is associated with a probability function that takes, as input, a particular set of values for the node's parent variables, and gives (as output) the probability (or probability distribution, if applicable) of the variable represented by the node.
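As an illustration of nodes carrying conditional probability functions, the sketch below encodes a three-node network (rain, sprinkler, wet grass) as plain dictionaries and answers a query by enumerating the joint distribution. The structure and all probabilities are hypothetical textbook-style values, not taken from this text.

```python
import itertools

# Hypothetical conditional probability tables.
P_RAIN = 0.2                                        # P(rain)
P_SPRINKLER_GIVEN_RAIN = {True: 0.01, False: 0.4}   # P(sprinkler | rain)
P_WET_GIVEN = {                                     # P(wet | sprinkler, rain)
    (True, True): 0.99,
    (True, False): 0.9,
    (False, True): 0.8,
    (False, False): 0.0,
}


def bernoulli(p_true, value):
    """Probability of a binary `value` when P(True) = p_true."""
    return p_true if value else 1.0 - p_true


def joint(rain, sprinkler, wet):
    """Joint probability as the product of each node's conditional."""
    return (bernoulli(P_RAIN, rain)
            * bernoulli(P_SPRINKLER_GIVEN_RAIN[rain], sprinkler)
            * bernoulli(P_WET_GIVEN[(sprinkler, rain)], wet))


# Query P(rain | wet) by summing the joint over the unobserved node.
p_wet = sum(joint(r, s, True)
            for r, s in itertools.product([True, False], repeat=2))
p_rain_and_wet = sum(joint(True, s, True) for s in [True, False])
p_rain_given_wet = p_rain_and_wet / p_wet
print(round(p_rain_given_wet, 4))
```

Enumeration is only feasible for small networks, which is one motivation for sampling methods.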

Gibbs sampling is applicable when the joint distribution is not known explicitly or is difficult to sample from directly, but the conditional distribution of each variable is known and is easy (or at least, easier) to sample from.

The Gibbs sampling algorithm generates an instance from the distribution of each variable in turn, conditional on the current values of the other variables. Gibbs sampling is particularly well-adapted to sampling the posterior distribution of a Bayesian network, since Bayesian networks are typically specified as a collection of conditional distributions.
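A minimal sketch of the idea, using a target chosen for illustration rather than taken from the text: a standard bivariate normal with correlation `rho`. Its conditionals are known in closed form, x | y ~ N(rho·y, 1 - rho²) and symmetrically for y | x, so each variable can be resampled in turn. The function name, burn-in length and seed are assumptions.

```python
import math
import random


def gibbs_bivariate_normal(rho, n_samples, burn_in=500, seed=0):
    """Draw samples from a standard bivariate normal with correlation rho."""
    rng = random.Random(seed)
    cond_std = math.sqrt(1.0 - rho * rho)
    x, y = 0.0, 0.0  # arbitrary starting state
    samples = []
    for step in range(burn_in + n_samples):
        # Resample each variable conditional on the current value of the other.
        x = rng.gauss(rho * y, cond_std)
        y = rng.gauss(rho * x, cond_std)
        if step >= burn_in:  # discard early samples that depend on the start
            samples.append((x, y))
    return samples


samples = gibbs_bivariate_normal(rho=0.8, n_samples=20000)
n = len(samples)
mean_x = sum(x for x, _ in samples) / n
mean_y = sum(y for _, y in samples) / n
var_x = sum((x - mean_x) ** 2 for x, _ in samples) / n
var_y = sum((y - mean_y) ** 2 for _, y in samples) / n
cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in samples) / n
corr = cov_xy / math.sqrt(var_x * var_y)
print(round(corr, 2))  # should land close to the target correlation 0.8
```

The sampler never evaluates the joint density. It only draws from the two one-dimensional conditionals, yet the empirical correlation of the chain approaches that of the joint.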

Given an input vector v, we use p(h|v) to predict the hidden values h. Knowing the hidden values, we use p(v|h) to predict new input values v. This process is repeated k times. After k iterations we obtain another input vector v_k, which was recreated from the original input values v_0.
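The alternating procedure can be sketched as below for binary units, assuming the usual sigmoid conditionals of an RBM. The helper names, the parameters `W`, `b`, `c` and their random values are hypothetical. A trained model would supply learned parameters instead.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sample_hidden(v, W, c, rng):
    """Sample h ~ p(h | v), one independent Bernoulli draw per hidden unit."""
    probs = [sigmoid(c[j] + sum(v[i] * W[i][j] for i in range(len(v))))
             for j in range(len(c))]
    return [1 if rng.random() < p else 0 for p in probs]


def sample_visible(h, W, b, rng):
    """Sample v ~ p(v | h), one independent Bernoulli draw per visible unit."""
    probs = [sigmoid(b[i] + sum(W[i][j] * h[j] for j in range(len(h))))
             for i in range(len(b))]
    return [1 if rng.random() < p else 0 for p in probs]


def gibbs_chain(v0, W, b, c, k, rng):
    """Run k alternating steps v -> h -> v and return the reconstruction v_k."""
    v = list(v0)
    for _ in range(k):
        h = sample_hidden(v, W, c, rng)
        v = sample_visible(h, W, b, rng)
    return v


rng = random.Random(1)
# Hypothetical untrained parameters for 4 visible and 3 hidden units.
W = [[rng.gauss(0.0, 0.5) for _ in range(3)] for _ in range(4)]
b = [0.0] * 4  # visible biases
c = [0.0] * 3  # hidden biases

v_0 = [1, 0, 1, 1]
v_k = gibbs_chain(v_0, W, b, c, k=5, rng=rng)
print(v_0, "->", v_k)
```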

Measurements of any kind, in any experiment, are always subject to uncertainties, or errors as they are more often called. The measurement process is, in fact, a random process described by an abstract probability distribution whose parameters contain the desired information. The results of a measurement are then samples from this distribution, which allow an estimate of the theoretical parameters. In this view, measurement errors can be seen as sampling errors.
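A short simulation of this view, assuming Gaussian measurement noise and hypothetical values for the true quantity and the noise level: repeated measurements are samples, the sample mean estimates the underlying parameter, and the standard error quantifies the sampling error of that estimate.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical experiment: the true value and noise level are assumptions.
TRUE_VALUE = 9.81
NOISE_STD = 0.05
N_MEASUREMENTS = 100

# Each measurement is one sample from the underlying distribution.
measurements = [rng.gauss(TRUE_VALUE, NOISE_STD) for _ in range(N_MEASUREMENTS)]

estimate = statistics.mean(measurements)        # estimate of the parameter
spread = statistics.stdev(measurements)         # estimate of the noise level
standard_error = spread / N_MEASUREMENTS ** 0.5  # uncertainty of the mean

print(f"{estimate:.3f} +/- {standard_error:.3f}")
```

The standard error shrinks with the square root of the number of measurements, so quadrupling the sample size roughly halves the uncertainty of the estimate.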

Designing and developing a digital twin requires a mathematical model of the given physical process. A neural network can model a system from a data set and its associated labels. Pontryagin duality and the Pontryagin observer appear to be helpful in creating a digital twin, and a Boltzmann machine is a neural network that can be used to model a physical process. In probability, statistics, data science, machine learning, deep learning, or any similar field, knowing the distribution of the data is crucial because it guides how the data should be handled.

**Global Energy is**
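For reference, the global energy of a general Boltzmann machine is commonly written as follows. The symbols are assumptions, since the text does not define them: $s_i \in \{0, 1\}$ is the state of unit $i$, $w_{ij}$ the symmetric connection weight between units $i$ and $j$, and $\theta_i$ the bias of unit $i$.

$$
E = -\left( \sum_{i<j} w_{ij}\, s_i\, s_j + \sum_i \theta_i\, s_i \right)
$$

In a restricted Boltzmann machine only visible-hidden pairs are connected, so the first sum runs over those pairs alone.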

**Learning Process**