For example, the number of telephone calls to a busy switchboard in one hour follows a Poisson distribution: the events appear frequent to the operator, but they are rare from the point of view of the average member of the population, who is very unlikely to make a call to that switchboard in that hour.

This is itself a special case of a more general set of results in which the logarithm of a product can be written as the sum of the logarithms.

For the gamma distribution there are two equivalent parameterizations in common use: a shape parameter k with a scale parameter θ, or equivalently a shape parameter α = k with a rate parameter β = 1/θ.

A simple algorithm to generate random Poisson-distributed numbers (pseudo-random number sampling) has been given by Knuth.[62]:137–138
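Knuth's method multiplies uniform variates together until the running product drops below e^(−λ); a minimal sketch follows (the function name and the `rng` parameter are illustrative, not part of Knuth's original presentation):

```python
import math
import random

def poisson_knuth(lam, rng=random):
    """Draw one Poisson(lam) variate by Knuth's multiplicative method.

    Expected running time grows linearly with lam, so this is only
    practical for small rates.
    """
    threshold = math.exp(-lam)
    k = 0
    p = 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1
```

Because the loop body runs about λ + 1 times on average, faster algorithms are preferred for large λ.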
The less trivial task is to draw an integer random variate from the Poisson distribution with a given λ; solutions include inverse transform sampling.

The Poisson distribution arises as the number of points of a Poisson point process located in some finite region.[10][59]

For example, as described in the introduction, the word "the" would appear at x = log(1) (order rank = 1) and y = log(69 971).

In every case Belevitch obtained the remarkable result that a first-order truncation of the series resulted in Zipf's law.

The Poisson distribution can be applied to systems with a large number of possible events, each of which is rare. This can be used, for example, to compute the Cramér–Rao bound for parameter estimation in this setting.
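Under the classical Zipf model, the kth most common word in a vocabulary of N words claims a fraction f(k; s, N) = k^(−s) / H(N, s) of all occurrences, where H(N, s) is the generalized harmonic number. A small sketch (the function name is illustrative):

```python
def zipf_frequency(k, s, n):
    """Expected fraction of all word occurrences claimed by the k-th
    most common word, under Zipf's law with exponent s and vocabulary
    size n (normalized by the generalized harmonic number)."""
    harmonic = sum(1.0 / i**s for i in range(1, n + 1))
    return (1.0 / k**s) / harmonic
```

With s = 1 the frequencies fall off as 1/k, so on a log-log rank–frequency plot they trace a line of slope −1.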
The directions of the principal axes of the ellipsoids are given by the eigenvectors of the covariance matrix. To introduce a positive skew, perturb the normal distribution upward by a small amount at a value many σ larger than the mean.

The posterior distribution can be found by updating the prior parameters with the data, where n is the number of observations and x_i is the ith observation. The gamma distribution is widely used as a conjugate prior in Bayesian statistics.

Zipf himself proposed that neither speakers nor hearers using a given language want to work any harder than necessary to reach understanding, and the process that results in approximately equal distribution of effort leads to the observed Zipf distribution.[21][22] The value f(k; s, N) will then be the fraction of the time the kth most common word occurs.

Knowing the distribution we want to investigate, it is easy to see that the statistic is complete. Observe how the positive-definiteness of Σ implies that the variance of the dot product must be positive.

Accordingly, the Poisson distribution is sometimes called the "law of small numbers", because it is the probability distribution of the number of occurrences of an event that happens rarely but has very many opportunities to happen.
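For the gamma prior on a Poisson rate, the update described above reduces to adding the observation total to the shape and the sample size to the rate. A minimal sketch, assuming the shape/rate parameterization (the function name is illustrative):

```python
def gamma_poisson_update(alpha, beta, observations):
    """Update a Gamma(alpha, beta) prior (shape/rate form) on a Poisson
    rate after observing integer counts: the shape absorbs the total
    count, the rate absorbs the sample size."""
    return alpha + sum(observations), beta + len(observations)
```

The posterior mean is then shape/rate, a precision-weighted compromise between the prior mean and the sample mean.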
Multivariate normality tests include the Cox–Small test.[27]

The approximate distribution of a correlation coefficient can be found via the Fisher transformation.

The mean speed, most probable speed v_p, and root-mean-square speed can be obtained from properties of the Maxwell distribution.

The occurrence of one event does not affect the probability that a second event will occur. In 1860, Simon Newcomb fitted the Poisson distribution to the number of stars found in a unit of space. Then any given observation can be assigned to the distribution from which it has the highest probability of arising.

Thus, the local extreme is unique and, by the above discussion, the maximum is unique, provided a local extreme actually exists.

In a Poisson process, the number of observed occurrences fluctuates about its mean λ with a standard deviation σ_k = √λ. In an example above, an overflow flood occurred once every 100 years (λ = 1).

Random generation of gamma variates is discussed in detail by Devroye,[36]:401–428 noting that none of the methods are uniformly fast for all shape parameters. A typical application of the Poisson model is the number of students achieving a low and a high mark in an exam.
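For the flood example, with λ = 1 event per century, the probability of observing k floods in a given century follows directly from the mass function. A quick check using the textbook formula (fine at these small arguments):

```python
import math

def poisson_pmf(k, lam):
    """P(K = k) for K ~ Poisson(lam), via the textbook formula
    lam**k * exp(-lam) / k!  (adequate for small k and lam)."""
    return lam**k * math.exp(-lam) / math.factorial(k)
```

With λ = 1, both P(0) and P(1) equal e^(−1) ≈ 0.368, so roughly one century in three sees no flood at all.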
It has been claimed that this representation of Zipf's law is more suitable for statistical testing, and in this way it has been analyzed in more than 30,000 English texts.[14] The law is named after the American linguist George Kingsley Zipf, who popularized it and sought to explain it, though he did not claim to have originated it.[1]

The maximum entropy principle can be used to upper bound the entropy of statistical mixtures. The derivation of the maximum-likelihood estimator of the covariance matrix of a multivariate normal distribution is straightforward.

Many other molecular applications of Poisson noise have been developed, e.g., estimating the number density of receptor molecules in a cell membrane.[citation needed]

For a distribution on the non-negative integers with mean μ, the maximum entropy distribution has the shape p(k) = C rᵏ, where the constants C and r are determined by the requirements that the sum of all the probabilities must be 1 and the expected value must be μ. It follows that the maximising distribution must be an internal point in the space of distributions satisfying the expectation constraints; that is, it must be a local extreme.

This gives the PDF of the product of the n samples; the following, more conventional, derivation from Stackexchange[6] is consistent with this result.

The Wishart distribution is named in honor of John Wishart, who first formulated it in 1928.[1]
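Solving those two constraints gives the geometric form r = μ/(1 + μ) and C = 1/(1 + μ). A numerical spot-check of both constraints by truncated summation (a sketch; the function name is illustrative):

```python
def maxent_pmf_nonneg_integers(mu, terms=5000):
    """Max-entropy pmf on k = 0, 1, 2, ... with mean mu: p(k) = C * r**k,
    with r = mu/(1+mu) and C = 1/(1+mu).  Returns (C, r, total_mass, mean),
    the last two computed by truncated summation as a numerical check."""
    r = mu / (1.0 + mu)
    c = 1.0 / (1.0 + mu)
    total = sum(c * r**k for k in range(terms))
    mean = sum(k * c * r**k for k in range(terms))
    return c, r, total, mean
```

This is exactly the geometric distribution, which is therefore the maximum-entropy distribution on the non-negative integers for a fixed mean.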
The K-distribution is an example of a non-standard distribution that can be defined as a product distribution (where both components have a gamma distribution). The distribution of the product of a random variable having a uniform distribution on (0, 1) with a random variable having a gamma distribution with shape parameter equal to 2 is an exponential distribution.[28] The density of the product of two independent normal samples involves a modified Bessel function.

Aggregate insurance claims and the amount of rainfall accumulated in a reservoir are modelled by a gamma process, much as the exponential distribution generates a Poisson process.

Let the total number of events in the whole interval be denoted by λ.

According to the principle of maximum entropy, if nothing is known about a distribution except that it belongs to a certain class (usually defined in terms of specified properties or measures), then the distribution with the largest entropy should be chosen as the least-informative default.

For example, the charge e on an electron can be estimated by correlating the magnitude of an electric current with its shot noise.

María Dolores Ugarte and colleagues report that the average number of goals in a World Cup soccer match is approximately 2.5, and the Poisson model is appropriate.
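The uniform-times-gamma identity above is easy to spot-check by simulation: the product U·G should behave like Exponential(1), whose mean and variance are both 1. A standard-library sketch (the function name is illustrative):

```python
import random

def uniform_times_gamma_samples(n, rng):
    """Draw n samples of U * G with U ~ Uniform(0,1) and
    G ~ Gamma(shape=2, scale=1), which by the identity in the text
    should follow an Exponential(1) distribution."""
    return [rng.random() * rng.gammavariate(2.0, 1.0) for _ in range(n)]
```

A more rigorous comparison would use a goodness-of-fit test; matching the first two moments is merely a quick sanity check.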
The higher non-centered moments m_k of the Poisson distribution are Touchard polynomials in λ.

Moreover, U can be chosen to be a rotation matrix, as inverting an axis does not have any effect on N(0, Σ), but inverting a column changes the sign of U's determinant.

The principal-components transformation can also be associated with another matrix factorization, the singular value decomposition (SVD) of X, X = UΣVᵀ. Here Σ is an n-by-p rectangular diagonal matrix of positive numbers σ(k), called the singular values of X, and U is an n-by-n matrix whose columns are orthogonal unit vectors of length n, called the left singular vectors of X.

In bacterial gene expression, the copy number of a constitutively expressed protein often follows the gamma distribution, where the scale and shape parameter are, respectively, the mean number of bursts per cell cycle and the mean number of protein molecules produced by a single mRNA during its lifetime.[33]

The least informative, proper Wishart prior is obtained by setting n = p.[citation needed]

Some computing languages provide built-in functions to evaluate the Poisson distribution. However, the conventional definition of the Poisson distribution contains two terms that can easily overflow on computers: λ^k and k!.
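The first few Touchard polynomials give E[X] = λ, E[X²] = λ + λ², and E[X³] = λ + 3λ² + λ³. These can be checked numerically by summing the mass function, advancing the pmf recursively so no large factorials are ever formed (a sketch):

```python
import math

def poisson_raw_moment(order, lam, cutoff=200):
    """E[X**order] for X ~ Poisson(lam), approximated by summing the
    mass function term by term; the pmf is advanced via the recurrence
    P(k+1) = P(k) * lam / (k+1) to avoid computing large factorials."""
    term = math.exp(-lam)  # P(X = 0)
    total = 0.0
    for k in range(cutoff):
        total += k**order * term
        term *= lam / (k + 1)
    return total
```

The truncation at `cutoff` is harmless here because the Poisson tail decays faster than geometrically.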
The Poisson distribution applies to various phenomena of discrete nature (that is, those that may happen 0, 1, 2, 3, … times during a given period of time or in a given area) whenever the probability of the phenomenon happening is constant in time or space, such as the number of bacteria in a certain amount of liquid.

In probability and statistics, Student's t-distribution (or simply the t-distribution) is any member of a family of continuous probability distributions that arise when estimating the mean of a normally distributed population in situations where the sample size is small and the population's standard deviation is unknown.

Every distribution with log-concave density is a maximal entropy distribution with specified mean and deviation risk measure D.[10]

The coefficients for this estimator are tabled to five decimal places for N = 2(1)15 and censoring values of r1 (the number of observations censored from the left) and r2 (the number of observations censored from the right).

Suppose we wish to generate random variables from Gamma(n + δ, 1), where n is a non-negative integer and 0 < δ < 1.

The marginal distributions are Poisson(θ1) and Poisson(θ2), and the correlation coefficient is limited to a restricted, nonnegative range.[35] A simple way to generate a bivariate Poisson distribution is to draw independent Poisson variables and add a shared component to each coordinate, so that the common term induces the correlation.

In Bayesian statistics, the conjugate prior of the mean vector is another multivariate normal distribution, and the conjugate prior of the covariance matrix is an inverse-Wishart distribution.
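One common route splits Gamma(n + δ, 1) into an integer-shape part, a sum of n Exp(1) variates obtained as −ln of a product of uniforms, plus an independent Gamma(δ, 1) part. The sketch below delegates the fractional part to the standard library's `random.gammavariate` rather than the rejection step used in textbook treatments, so it illustrates only the shape-additivity idea:

```python
import math
import random

def gamma_n_plus_delta(n, delta, rng=random):
    """Sample Gamma(n + delta, 1) by shape-additivity: a sum of n Exp(1)
    variates (as -log of a product of uniforms) plus an independent
    Gamma(delta, 1) variate (delegated here to random.gammavariate)."""
    integer_part = -math.log(math.prod(rng.random() for _ in range(n)))
    fractional_part = rng.gammavariate(delta, 1.0) if delta > 0 else 0.0
    return integer_part + fractional_part
```

The resulting variate has mean and variance n + δ, which gives a quick simulation check.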
For angles distributed about the unit circle, the von Mises distribution maximizes the entropy when the real and imaginary parts of the first circular moment are specified,[9] or, equivalently, when the circular mean and circular variance are specified.

The product of n Gamma and m Pareto independent samples was derived by Nadarajah.

A discrete random variable X is said to have a Poisson distribution with parameter λ > 0 if its probability mass function is f(k; λ) = λ^k e^(−λ) / k! for k = 0, 1, 2, …. For numerical stability the Poisson probability mass function should therefore be evaluated as

f(k; λ) = exp(k ln λ − λ − ln Γ(k + 1)).

Cumulative probabilities are examined in turn until one exceeds u.

Only 135 vocabulary items are needed to account for half the Brown Corpus. In the figure above of the 10 million Wikipedia words, the log-log plots are not precisely straight lines but rather slightly concave curves with a tangent of slope −1 at some point along the curve.

In probability theory and statistics, the gamma distribution is a two-parameter family of continuous probability distributions. The exponential distribution, Erlang distribution, and chi-square distribution are special cases of the gamma distribution.

The name "law of rare events" may be misleading, because the total count of success events in a Poisson process need not be rare if the parameter np is not small.

As noted in "Lognormal Distributions" above, PDF convolution operations in the log domain correspond to the product of sample values in the original domain.
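That log-domain evaluation can be written directly with `math.lgamma`; unlike the naive λ^k / k! form it remains finite for large k and λ (a sketch):

```python
import math

def poisson_pmf_stable(k, lam):
    """Poisson pmf evaluated as exp(k*ln(lam) - lam - lgamma(k+1)),
    avoiding overflow in lam**k and k! for large arguments."""
    return math.exp(k * math.log(lam) - lam - math.lgamma(k + 1))
```

For λ = k = 1000 the naive formula overflows both λ^k and k!, while the log-domain form returns a finite value close to 1/√(2πλ).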
Several of the examples listed (Bernoulli, geometric, exponential, Laplace, Pareto) are trivially true, because their associated constraints are equivalent to the assignment of their entropy.[6][7] However, there is no distribution which achieves this upper bound.

The density takes the form of an infinite series of modified Bessel functions of the first kind.[10]

It was originally derived to explain population versus rank in species by Yule, and applied to cities by Simon.

Similar results may be found for higher dimensions, but the interdependence of the off-diagonal correlations becomes increasingly complicated.[16]

By correlating the graininess with the degree of enlargement, one can estimate the contribution of an individual grain (which is otherwise too small to be seen unaided).

Now, using the addition property of the gamma distribution in its shape parameter, we expand this result, where the U_k are all uniformly distributed on (0, 1] and independent. The skewness, being proportional to the third moment, will be affected more than the lower-order moments.