Probabilistic Overview of Probabilities of Default for Low Default Portfolios by K. Pluto and D. Tasche

This article gives a probabilistic overview of the widely used method of default probability estimation proposed by K. Pluto and D. Tasche. It lists the detailed assumptions and derives the inequality that involves the probability of default under the influence of a systematic factor. The author aims to add clarity, especially for early career analysts or scholars, regarding the assumptions of borrowers' independence and conditional independence, and the interaction between probability distributions such as the binomial, beta, normal and others. The article also shows the relation between the probability of default and the distribution of $\sqrt{\varrho}X-\sqrt{1-\varrho}Y$, where $X$ and $Y$ are independent, $X$ is, for instance, standard normal and $Y$ follows, for instance, the beta-normal distribution.


Introduction
The probability of default $p$ is arguably the most important metric in credit risk management. Roughly, this probability measures the likelihood that a certain obligor fails to meet its financial commitments within a certain period of time, typically one year. The number of defaulted borrowers divided by the total number of borrowers within a certain portfolio is known as the observed default rate, while the predicted one, $p$, is called the expected default rate. Any model tasked with predicting $p$ should ensure the alignment between the observed and expected default rates. However, in some instances, such as low default portfolios, it is not possible to obtain a robust observed default rate. In such instances, the famous work [14] suggests applying Bernoulli trials to estimate the experiment's success probability $p$. This work is a survey of two models: (a) the estimation of $p$ when the obligors in the portfolio are treated as independent of each other and there is no outside influence on the portfolio, (b) the estimation of $p$ when the obligors in the portfolio are treated as conditionally independent of each other, given that each obligor is influenced by some systematic factor. We aim to reflect on the detailed steps and assumptions used in deriving the two mentioned models.
Let us recall several well-known probability distributions.
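• We say that the random variable $X$ is binomially distributed with parameters $n\in\mathbb{N}$ and $p\in(0,1)$ (denoted $X\sim \mathrm{Bin}(n,p)$) if its probability mass function is
$$P(X=k)=\binom{n}{k}p^k(1-p)^{n-k},\quad k=0,1,\ldots,n, \qquad (1)$$
where $\binom{n}{k}=\frac{n!}{k!\,(n-k)!}$. We denote the cumulative distribution function of the binomial random variable by
$$\mathrm{Bin}_{n,p}(k):=P(X\le k)=\sum_{i=0}^{k}\binom{n}{i}p^i(1-p)^{n-i},\quad k=0,1,\ldots,n. \qquad (2)$$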
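As a small illustration, the probability mass function (1) and the distribution function (2) can be evaluated directly. The following Python sketch is not part of the computations of this paper (which rely on the program [12]); the function names `binom_pmf` and `binom_cdf` are chosen here for illustration only.

```python
from math import comb


def binom_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p), as in (1)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def binom_cdf(k, n, p):
    """Bin_{n,p}(k) = P(X <= k), as in (2)."""
    return sum(binom_pmf(i, n, p) for i in range(k + 1))


# Probability of at most one default among n = 2 obligors with p = 0.5.
print(binom_cdf(1, 2, 0.5))  # 0.75
```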
The other distributions met in this paper, are introduced in the proper places where they are used.

Binomial and mixture binomial distributions for the probability of default estimation
In this section, in Subsections 2.1 and 2.2 respectively, we review the derivation of two methods used to estimate the probability of default $p$. As mentioned in the Introduction, the first method is just the Bernoulli trials assuming the obligors' independence, while the second method provides the estimation of $p$ under the assumption that the obligors are conditionally independent of each other under the influence of a certain systematic factor.

Binomial distribution
Let $X_1, X_2, \ldots, X_n$ be independent copies of a Bernoulli random variable $X$ whose distribution is $P(X=1)=p=1-P(X=0)$. In risk management, the random variables $X_1, X_2, \ldots, X_n$ are treated as independent obligors, and the attained value $X_i=1$, $i=1,2,\ldots,n$, means that the $i$-th obligor defaults within some observation period (typically one year), while $X_i=0$ means that the $i$-th obligor does not default within the same observation period. Then, the sum $Y:=X_1+X_2+\ldots+X_n$ may attain any value $k$ from the set $\{0,1,\ldots,n\}$ with probability
$$P(Y=k)=\binom{n}{k}p^k(1-p)^{n-k}.$$
The probability mass function of the binomial distribution (1) gives the probability that exactly $k$ out of the total $n$ obligors in the portfolio default, while the distribution function (2) is the probability that no more than $k$ obligors out of the total $n$ default. Obviously, $\mathrm{Bin}_{n,p}(n)=1$ for any $p\in(0,1)$.
In many examples, e.g., tossing a coin or a die, the experiment's success probability $p$ is known beforehand. However, in real-life problems, such as default probability estimation, the probability $p$ is the unknown quantity. In order to get $p$ out of (1) or (2) we need an expert judgment first. Let us suppose that the probability that the number of defaulted obligors does not exceed $k\in\{0,1,\ldots,n-1\}$ out of $n$ is at least $1-\gamma$. Then,
$$\mathrm{Bin}_{n,p}(k)\ge 1-\gamma, \qquad (3)$$
and, in view of Proposition 1, the upper bound of the default probability $p$ is
$$p\le 1-B^{-1}_{n-k,\,k+1}(1-\gamma), \qquad (4)$$
where $B^{-1}_{n-k,\,k+1}(\cdot)$ is the inverse of the beta distribution function. In particular, if $k=0$, i.e., we are certain with probability $1-\gamma$ that there will be no defaulted obligors at all, then $p\le 1-(1-\gamma)^{1/n}$. The probability $1-\gamma$ can be interpreted as the probability of type I error, also known as the false positive instance classification, which in our context means that the actual probability of default does not belong to the predicted interval $0\le p\le 1-B^{-1}_{n-k,\,k+1}(1-\gamma)$; see [4]. Moreover, the corresponding confidence interval for the parameter of the binomial distribution is known as the Clopper-Pearson interval; see [5], [3].
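The upper bound of $p$ described above can also be found without the inverse beta distribution function, by searching for the largest $p$ at which the binomial distribution function (2) evaluated at $k$ is still at least $1-\gamma$. The Python sketch below does this by bisection; it is an illustration under that assumption rather than the procedure used in [14] or in the program [12], and the name `pd_upper_bound` is hypothetical.

```python
from math import comb


def binom_cdf(k, n, p):
    """Bin_{n,p}(k) = P(at most k defaults among n independent obligors)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def pd_upper_bound(n, k, gamma, tol=1e-12):
    """Largest p with Bin_{n,p}(k) >= 1 - gamma, found by bisection.

    Relies on Bin_{n,p}(k) being decreasing in p for k < n.
    """
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if binom_cdf(k, n, mid) >= 1 - gamma:
            lo = mid  # inequality still holds, the bound is at least mid
        else:
            hi = mid
    return lo


# For k = 0 the result can be compared with the closed form 1 - (1 - gamma)^(1/n).
n, gamma = 100, 0.9
print(pd_upper_bound(n, 0, gamma), 1 - (1 - gamma) ** (1 / n))
```

For $k=0$ the two printed values should agree up to the bisection tolerance, which gives a quick sanity check of the search.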

Mixture of Binomial and Normal distributions
Let $r$ be an annual return rate and $(1+r/n)^n$, $n\in\mathbb{N}$, the growth factor of the invested amount when the return rate is compounded $n$ times per year. It is well known that $(1+r/n)^n\to e^r$ when $n\to\infty$. Thus, assuming the continuously compounded return,
$$V_F=V_I\,e^{r},$$
where $V_F>0$ denotes the final value and $V_I>0$ the initial one. Based on the previous thoughts, we define
$$r_{\log}:=\log\frac{V_F}{V_I}. \qquad (5)$$
The return defined in (5) is called the logarithmic return or just log-return.
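The limit behind the log-return can be checked numerically. The short Python sketch below uses an arbitrary illustrative rate $r=5\%$ and a hypothetical helper `log_return`; neither comes from the sources cited in this paper.

```python
from math import exp, log


def log_return(v_initial, v_final):
    """Logarithmic return log(V_F / V_I) for positive values."""
    return log(v_final / v_initial)


r = 0.05  # illustrative annual rate
for n in (1, 12, 365, 10**6):
    # growth factor when the rate is compounded n times per year
    print(n, (1 + r / n) ** n)
print("limit", exp(r))

# Under continuous compounding the log-return recovers the rate r.
print(log_return(100.0, 100.0 * exp(r)))
```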
We now assume the logarithmic return to be a random variable. More precisely, we assume
$$r_{\log}=\beta S+\xi, \qquad (6)$$
where $\beta$ is a constant, $S\sim N(\mu_1,\sigma_1^2)$, $\xi\sim N(\mu_2,\sigma_2^2)$, and the random variables $S$ and $\xi$ are independent and both non-degenerate. Also, $S$ is known as the systematic risk factor, while $\xi$ is the idiosyncratic one; see [9]. The definition (6) of the return has similarities with the capital asset pricing model, which states that every expected return $Er_i$ under certain assumptions satisfies
$$Er_i=r_f+\beta_i\,(Er_M-r_f),$$
where $r_f$ is the risk-free return rate, $r_M$ the return of the market portfolio $M$ and $\beta_i=\operatorname{cov}(r_i,r_M)/\sigma_M^2$; see, for example, [8], and observe that (6) implies $E(r_{\log}-\xi)=\beta ES$.
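Let us now standardize the log-return (6). It is easy to check that $Er_{\log}=\beta ES+E\xi$ and $\operatorname{Var}r_{\log}=\beta^2\operatorname{Var}S+\operatorname{Var}\xi$. Thus, after subtracting the mean and dividing by the standard deviation, it is equivalent to define $r_{\log}$ as
$$r_{\log}=\sqrt{\varrho}\,S+\sqrt{1-\varrho}\,\xi, \qquad (8)$$
where
$$\varrho=\frac{\beta^2\operatorname{Var}S}{\beta^2\operatorname{Var}S+\operatorname{Var}\xi}$$
is computed from the parameters in (6), and $S$, $\xi$ in (8) are independent standard normal random variables. Indeed, $Er_{\log}=0$ and $\operatorname{Var}r_{\log}=\varrho+(1-\varrho)=1$ in (8), and the coefficient $\varrho$ in (8) is called the asset correlation (see [19]); it expresses the correlation between $r_{\log}$ and $S$:
$$\operatorname{corr}(r_{\log},S)=\sqrt{\varrho}.$$

We now define the default event $D$ by
$$D:=\begin{cases}1, & \sqrt{\varrho}\,S+\sqrt{1-\varrho}\,\xi\le x_p,\\ 0, & \text{otherwise.}\end{cases}$$
Of course, $D$ is a Bernoulli random variable with $P(D=1)=p$ and $x_p=\Phi^{-1}(p)$, since the random variable $\sqrt{\varrho}\,S+\sqrt{1-\varrho}\,\xi$ is standard normal. We are now interested in that particular $p$ which causes $D=1$. Conditioning on $S$, i.e., assuming that the systematic factor attains some particular value $x\in\mathbb{R}$, for $\varrho\ne 1$ we have
$$P(D=1\mid S=x)=P\left(\xi\le\frac{x_p-\sqrt{\varrho}\,x}{\sqrt{1-\varrho}}\right)=\Phi\left(\frac{\Phi^{-1}(p)-\sqrt{\varrho}\,x}{\sqrt{1-\varrho}}\right)$$
and
$$P(D=0\mid S=x)=1-\Phi\left(\frac{\Phi^{-1}(p)-\sqrt{\varrho}\,x}{\sqrt{1-\varrho}}\right).$$
The distribution of the random variable
$$\Phi\left(\frac{\Phi^{-1}(p)-\sqrt{\varrho}\,S}{\sqrt{1-\varrho}}\right), \qquad (12)$$
where $S\sim N(0,1)$, is known as the Vasicek distribution; see [17].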
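To make the conditioning step more tangible, the following Python sketch evaluates the conditional default probability $\Phi\big((\Phi^{-1}(p)-\sqrt{\varrho}\,x)/\sqrt{1-\varrho}\big)$ and checks numerically that averaging it over $S\sim N(0,1)$ returns the unconditional $p$. The function names, the integration range $[-8,8]$ and the trapezoidal rule are assumptions made for this illustration only.

```python
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def conditional_pd(p, rho, x):
    """P(D = 1 | S = x) in the one-factor model, for 0 <= rho < 1."""
    return STD.cdf((STD.inv_cdf(p) - rho**0.5 * x) / (1 - rho) ** 0.5)


def average_pd(p, rho, half_width=8.0, steps=1600):
    """Trapezoidal approximation of E[conditional_pd(p, rho, S)], S ~ N(0, 1)."""
    h = 2 * half_width / steps
    total = 0.0
    for j in range(steps + 1):
        x = -half_width + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * conditional_pd(p, rho, x) * STD.pdf(x)
    return total * h


p, rho = 0.01, 0.12
# A low value of the systematic factor raises the conditional default probability.
print(conditional_pd(p, rho, -2.0), conditional_pd(p, rho, 2.0))
# Averaging over the systematic factor should give back p (up to numerical error).
print(average_pd(p, rho))
```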
Let $D_1, D_2, \ldots, D_n$ be conditionally independent copies of the random variable $D$, given that the systematic factor $S=x$. Then, with a slight abuse of notation, the sum $D:=D_1+D_2+\ldots+D_n$ is a binomial random variable and the conditional probability that exactly $i$ obligors default is
$$P(D=i\mid S=x)=\binom{n}{i}\,p(x)^i\,(1-p(x))^{n-i},\qquad p(x):=\Phi\left(\frac{\Phi^{-1}(p)-\sqrt{\varrho}\,x}{\sqrt{1-\varrho}}\right), \qquad (13)$$
where $i=0,1,\ldots,n$. Thus, being certain with probability at least $1-\gamma$ that up to $k\in\{0,1,\ldots,n-1\}$ obligors out of the total $n$ default, by the law of total probability we get
$$1-\gamma\le\int_{-\infty}^{\infty}\sum_{i=0}^{k}\binom{n}{i}\,p(x)^i\,(1-p(x))^{n-i}\,\varphi(x)\,dx, \qquad (14)$$
where $\varphi$ denotes the standard normal density. Notice that if $k=n$, then the inequality (14) is satisfied with any $p\in(0,1)$ when $\gamma\in[0,1]$. Also, $\varrho=0$ in (14) implies the inequality (3). Equally, the integral in (14) is nothing but the mixture of the binomial and Vasicek distributions: it is the cumulative binomial distribution function $\mathrm{Bin}_{n,p}(k)$, $k=0,1,\ldots,n$, when the parameter $p$ is Vasicek distributed (12). See [13] for the mixture distribution models.

According to Proposition 2, the upper bound of $p$ in (14) is given by (15), where $F^{-1}_{n-k,\,k+1,\,\varrho}(\cdot)$ is the inverse of the cumulative distribution function $F_{n-k,\,k+1,\,\varrho}(y)$ in (16). It is not easy to get a more convenient expression of the cumulative distribution function $F_{n-k,\,k+1,\,\varrho}(y)$ in (16). Thus, we should search for the quantiles of the underlying distribution, described by $F_{n-k,\,k+1,\,\varrho}(y)$, numerically; see Section 5. Of course, the function $F_{n-k,\,k+1,\,\varrho}(y)$ is defined in view of the representation (19) in Proposition 2, and it is equivalent to search for such $p\in(0,1)$ that solves the corresponding equation, where $F_{n-k,\,k+1,\,\varrho}(p)$, $p\in(0,1)$, is understood as a continuous cumulative distribution function with respect to $p$.
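A numerical search for the upper bound of $p$ under the systematic factor can be sketched as follows. The Python code below approximates the integral in (14) by the trapezoidal rule on $[-8,8]$ and then finds, by bisection, the largest $p$ for which the integral is still at least $1-\gamma$. This is only an illustration: the paper's own computations use the function "FindRoot" in the program [12], and the function names, the integration grid and the number of bisection steps are assumptions made here. The sketch is meant for small $k$, as in low default portfolios.

```python
from math import comb
from statistics import NormalDist

STD = NormalDist()  # standard normal distribution


def mixed_binom_cdf(k, n, p, rho, half_width=8.0, steps=800):
    """Trapezoidal approximation of the integral in (14)."""
    coeffs = [comb(n, i) for i in range(k + 1)]
    x_p = STD.inv_cdf(p)
    h = 2 * half_width / steps
    total = 0.0
    for j in range(steps + 1):
        x = -half_width + j * h
        # conditional default probability given the systematic factor S = x
        q = STD.cdf((x_p - rho**0.5 * x) / (1 - rho) ** 0.5)
        cdf = sum(c * q**i * (1 - q) ** (n - i) for i, c in enumerate(coeffs))
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * cdf * STD.pdf(x)
    return total * h


def pd_upper_bound_mixed(n, k, gamma, rho, iterations=50):
    """Largest p for which the integral in (14) is at least 1 - gamma."""
    lo, hi = 1e-12, 1 - 1e-12
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if mixed_binom_cdf(k, n, mid, rho) >= 1 - gamma:
            lo = mid
        else:
            hi = mid
    return lo


# With rho = 0 the bound should reduce to the independent case 1 - (1 - gamma)^(1/n).
print(pd_upper_bound_mixed(100, 0, 0.9, 0.0), 1 - 0.1 ** (1 / 100))
# With a positive asset correlation the bound becomes more conservative.
print(pd_upper_bound_mixed(800, 3, 0.9, 0.12))
```

Bisection is used here instead of a derivative-based root finder because the integral in (14) is monotone in $p$, so the search cannot leave the interval $(0,1)$.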
Let us mention that the probability distribution described by its cumulative distribution function
$$B_{\alpha,\beta}\left(\Phi\left(\frac{x-\mu}{\sigma}\right)\right),\quad x\in\mathbb{R},$$
is known as beta-normal. We write $X\sim BN(\alpha,\beta,\mu,\sigma^2)$ if $X$ is the beta-normal random variable, and $bn_{\alpha,\beta,\mu,\sigma^2}(x)$, $x\in\mathbb{R}$, denotes its density. See [6], [7], [15] and [10] for the beta-normal distribution. Thus, $F_{\alpha,\beta,\varrho}(y)$ in (16) can be easily described in terms of the beta-normal distribution. See also [2] as a good initial source on credit risk management and for some other insights into deriving inequality (14). Equally, in view of (16), we depict the probability density function for some chosen parameters in Figure 1 and the cumulative distribution function $F_{\alpha,\beta,\varrho}(y)$ itself in Figure 2 below.

To estimate the probability of default $p$ by (4) or (15) among the portfolio sub-classes $A_1, A_2, \ldots, A_l$, where $A_1$ represents the lowest risk borrowers and $A_l$ the highest risk ones, a method of conservatism was proposed; see [14]. The method of conservatism states the following. Let $n_1, n_2, \ldots, n_l$, $k_1, k_2, \ldots, k_l$ and $p_1, p_2, \ldots, p_l$ be the numbers of obligors, the numbers of expected defaults and the default probabilities over the portfolio sub-classes $A_1, A_2, \ldots, A_l$ respectively. Then $n_1+n_2+\ldots+n_l=n$, $k_1+k_2+\ldots+k_l=k$, and the probability of default $p_1$ should be estimated using the parameters $(n, k)$ in (4) or (15), $p_2$ should be estimated using $(n-n_1,\,k-k_1)$, and so on up to $p_l$, which should be estimated using $(n_l, k_l)$.
Discussion and dissatisfaction among practitioners, who find the estimates (4) and (15) of the probability of default too conservative, have prompted adjustments that estimate $p$ conditionally (at the cost of bias); see [16] and related papers.

Statements
In this section, we recall the connection between the binomial and beta distributions, provide several equivalent forms of the inequality (14) and show its connection to the multivariate normal distribution when there are no expected defaults, i.e. $k=0$.

Proposition 1. Let $n\in\mathbb{N}$, $k\in\{0,1,\ldots,n-1\}$ be fixed and $p\in(0,1)$. Then the cumulative distribution functions of the binomial and beta random variables are related as
$$\mathrm{Bin}_{n,p}(k)=1-B_{k+1,\,n-k}(p)=B_{n-k,\,k+1}(1-p),$$
where $B_{\alpha,\beta}(\cdot)$ denotes the cumulative distribution function of the beta random variable with parameters $\alpha$ and $\beta$.

Note 1: Let us emphasize that the function $\mathrm{Bin}_{n,p}(k)$ in Proposition 1 is understood as a function of $p\in(0,1)$, when $n$ and $k$ are fixed.
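The relation between the binomial and beta distribution functions recalled in Proposition 1 can be verified numerically for small parameters. In the Python sketch below, the beta distribution function with integer parameters is approximated by Simpson's rule; the function names and the choice $n=10$, $k=3$, $p=0.2$ are illustrative assumptions.

```python
from math import comb, factorial


def binom_cdf(k, n, p):
    """Bin_{n,p}(k) = P(X <= k) for X ~ Bin(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def beta_cdf(x, a, b, steps=2000):
    """Beta distribution function for integer a, b >= 1 via Simpson's rule."""
    norm = factorial(a + b - 1) / (factorial(a - 1) * factorial(b - 1))
    h = x / steps  # steps must be even for Simpson's rule

    def density_kernel(t):
        return t ** (a - 1) * (1 - t) ** (b - 1)

    s = density_kernel(0.0) + density_kernel(x)
    s += 4 * sum(density_kernel((2 * j - 1) * h) for j in range(1, steps // 2 + 1))
    s += 2 * sum(density_kernel(2 * j * h) for j in range(1, steps // 2))
    return norm * s * h / 3


n, k, p = 10, 3, 0.2
# The three printed numbers should coincide up to numerical error.
print(binom_cdf(k, n, p))
print(1 - beta_cdf(p, k + 1, n - k))
print(beta_cdf(1 - p, n - k, k + 1))
```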
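Then the inequality (14) admits the following equivalent representations:
$$1-\gamma\le\int_{-\infty}^{\infty}B_{n-k,\,k+1}\left(1-\Phi\left(\frac{\Phi^{-1}(p)-\sqrt{\varrho}\,x}{\sqrt{1-\varrho}}\right)\right)\varphi(x)\,dx, \qquad (18)$$
$$1-\gamma\le\int_{0}^{1}B_{n-k,\,k+1}\left(1-\Phi\left(\frac{\Phi^{-1}(p)-\sqrt{\varrho}\,\Phi^{-1}(x)}{\sqrt{1-\varrho}}\right)\right)dx, \qquad (19)$$
where $B_{n-k,\,k+1}(\cdot)$ is the cumulative distribution function of the beta random variable, and $\Phi$ and $\varphi$ denote the standard normal distribution function and density.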
Corollary 4. If $k=0$, $n\in\mathbb{N}$ and $X\sim N(0,1)$, then the left-hand side of the inequality (14) admits the representations (23) and (24), where $\Phi_R$ is the Gaussian copula with the correlation matrix $R$. On top of that, the multivariate density of $\Phi_R$ in (24) is given in (25).

Corollary 4 and its proof (see Section 4) yield the moments of the Vasicek distribution (12), and these moments are connected to the moments of the probability distribution given by
$$P(D=i)=\int_{-\infty}^{\infty}\binom{n}{i}\,p(x)^i\,(1-p(x))^{n-i}\,\varphi(x)\,dx,\qquad p(x):=\Phi\left(\frac{\Phi^{-1}(p)-\sqrt{\varrho}\,x}{\sqrt{1-\varrho}}\right), \qquad (26)$$
where $n\in\mathbb{N}$ and $i\in\{0,1,\ldots,n\}$, see (13). Indeed, due to the well-known moment-generating function of the binomial distribution, the moment-generating function $M(t)$ of (26) is
$$M(t)=E\left(1-p(X)+p(X)\,e^{t}\right)^n,$$
where $X\sim N(0,1)$. Notice that $M(\log t)$ is the probability-generating function of the underlying distribution.

Proofs
This section provides the proofs of the statements formulated in Section 3. Most of the given proofs are commonly known among researchers and scholars, and it is difficult to point to an original source.
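Proof of Proposition 1. Let us first show that
$$1-B_{k+1,\,n-k}(p)=B_{n-k,\,k+1}(1-p).$$
Indeed, by the change of variable $t\to 1-t$,
$$1-B_{k+1,\,n-k}(p)=\frac{n!}{k!\,(n-k-1)!}\int_{p}^{1}t^{k}(1-t)^{n-k-1}\,dt=\frac{n!}{(n-k-1)!\,k!}\int_{0}^{1-p}t^{n-k-1}(1-t)^{k}\,dt=B_{n-k,\,k+1}(1-p).$$
We now aim to prove
$$\mathrm{Bin}_{n,p}(k)=1-B_{k+1,\,n-k}(p),$$
where $\mathrm{Bin}_{n,p}(k)$ is considered as a function of $p\in(0,1)$ when $k$ and $n$ are fixed. Let us rewrite
$$f(p):=1-\mathrm{Bin}_{n,p}(k)=\sum_{i=k+1}^{n}\binom{n}{i}p^i(1-p)^{n-i}.$$
One may observe that $f(0)=0$, $f(1)=1$ and the derivative
$$f'(p)=\frac{n!}{k!\,(n-k-1)!}\,p^{k}(1-p)^{n-k-1}$$
is positive for all $p\in(0,1)$. Thus, $f(p)$ is a cumulative distribution function over the interval $p\in(0,1)$ and its derivative is nothing but the density of the beta distribution with parameters $(k+1,\,n-k)$, i.e., $f(p)=B_{k+1,\,n-k}(p)$.

Proof of Proposition 2. The integral in (14) implies (18) by Proposition 1, while (18) implies (19) by the change of variable $\Phi(x)\to x$.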
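Proof of Proposition 3. The probability (20) is implied by (18) observing that
$$P(Y\le aX+b)=\int_{-\infty}^{\infty}P(Y\le ax+b)\,\varphi(x)\,dx,\quad a,\,b\in\mathbb{R},$$
when $X\sim N(0,1)$ and $Y$ are independent. The remaining probabilities (21) and (22) are implied by the integral in (19) by the same arguments.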
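Proof of Corollary 4. Let $a,\,b\in\mathbb{R}$. Assume the random variables $Y_1,\ldots,Y_n$ are independent and identically distributed by $N(0,1)$. If $X\sim N(0,1)$ is independent of $Y_1,\ldots,Y_n$, so that the events $\{Y_i\le aX+b\}$, $i=1,\ldots,n$, are conditionally independent given $X$, then
$$E\,\Phi(aX+b)^n=P(Y_1\le aX+b,\ldots,Y_n\le aX+b).$$
The equality (23) follows by choosing $a=\sqrt{\varrho}/\sqrt{1-\varrho}$ and $b=-\Phi^{-1}(p)/\sqrt{1-\varrho}$, while the equality (24) is implied by observing that the random variables $(Y_i-aX)/\sqrt{1+a^2}$, $i=1,\ldots,n$, are standard normal with pairwise correlation $a^2/(1+a^2)=\varrho$, so that their correlation matrix $R$ has ones on the diagonal and $\varrho$ elsewhere. The determinant of $R$ is
$$\det R=(1-\varrho)^{n-1}\,(1+(n-1)\varrho)$$
and the inverse matrix of $R$ admits the following representation
$$R^{-1}=\frac{1}{1-\varrho}\left(I-\frac{\varrho}{1+(n-1)\varrho}\,J\right),$$
where $J$ is the $n\times n$ matrix of ones. Indeed, it is easy to check that $RR^{-1}=I$, where $I$ is the identity matrix.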
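The identity $RR^{-1}=I$ is easy to confirm numerically. The Python sketch below assumes that $R$ is the $n\times n$ matrix with ones on the diagonal and $\varrho$ in all other entries, and uses the standard closed forms of the determinant and the inverse of such a matrix; the function names and the values $n=4$, $\varrho=0.12$ are illustrative.

```python
def equicorrelation(n, rho):
    """n x n matrix with ones on the diagonal and rho elsewhere (assumed form of R)."""
    return [[1.0 if i == j else rho for j in range(n)] for i in range(n)]


def equicorrelation_inverse(n, rho):
    """Closed-form inverse (I - c J) / (1 - rho) with c = rho / (1 + (n - 1) rho)."""
    c = rho / (1 + (n - 1) * rho)
    return [
        [((1.0 if i == j else 0.0) - c) / (1 - rho) for j in range(n)]
        for i in range(n)
    ]


def equicorrelation_determinant(n, rho):
    """Closed-form determinant (1 - rho)^(n - 1) (1 + (n - 1) rho)."""
    return (1 - rho) ** (n - 1) * (1 + (n - 1) * rho)


def matmul(a, b):
    """Plain matrix product of two square lists of lists."""
    n = len(a)
    return [[sum(a[i][m] * b[m][j] for m in range(n)) for j in range(n)] for i in range(n)]


n, rho = 4, 0.12
product = matmul(equicorrelation(n, rho), equicorrelation_inverse(n, rho))
for row in product:
    print([round(v, 12) for v in row])  # rows of the identity matrix, up to rounding
print(equicorrelation_determinant(n, rho))
```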

Examples of computation
In this section, we give two examples that illustrate the discussed estimation of the default probability $p$. The required computations are performed with the program [12].
Example 5. Suppose there are up to 3 defaults expected with probability $1-\gamma$ out of 800 obligors, which are split into three risk classes: A, B and C, where A represents the lowest risk and C the highest. Assume the numbers of obligors are 100, 400, 300 and the numbers of expected defaults are up to 0, 2, 1 in risk classes A, B and C respectively. We apply Propositions 1, 2 and the method of conservatism introduced in [14] to estimate the probabilities of default $p_A$, $p_B$ and $p_C$ in risk classes A, B and C.
The method of conservatism (see [14] and the description at the end of Section 2) states that $p_A$ should be estimated for the entire portfolio, i.e., $n=800$ and $k=3$ in the considered case. The probability $p_B$ should be estimated for the entire portfolio excluding the class A, i.e., $n=700$ and $k=3$ in the considered case. Then, the probability $p_C$ is estimated using $n=300$ and $k=1$, as per the riskiest class C.
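The pooling of obligors and defaults prescribed by the method of conservatism can be written down in a few lines. In the Python sketch below, the hypothetical helper `pooled_parameters` takes the class sizes and expected defaults ordered from the lowest to the highest risk class and returns the pair $(n, k)$ to be used for each class.

```python
def pooled_parameters(obligors, defaults):
    """Pairs (n, k) per risk class: each class is pooled with all riskier classes."""
    return [
        (sum(obligors[j:]), sum(defaults[j:]))
        for j in range(len(obligors))
    ]


# Example 5: classes A, B, C.
print(pooled_parameters([100, 400, 300], [0, 2, 1]))
# [(800, 3), (700, 3), (300, 1)]

# Example 6: classes A, B, C, D.
print(pooled_parameters([400, 700, 250, 150], [2, 1, 3, 1]))
# [(1500, 7), (1100, 5), (400, 4), (150, 1)]
```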
Using Proposition 1, the underlying logic stated in Subsection 2.1 and the method of conservatism, we obtain Table 1. Note that the numbers in Table 1 are given in [14] too and we replicate them for comparison purposes, especially when calculating the quantiles of the underlying distribution given by $F_{n-k,\,k+1,\,\varrho}(y)$.
Suppose the asset correlation $\varrho=12\%$ in Example 5. Then, using Proposition 2, the underlying logic stated in Subsection 2.2, the method of conservatism and the function "FindRoot" in the program [12], we obtain Table 2 and Table 3. The provided numbers in Table 3 match the corresponding ones in [14] except for a few cases caused by rounding errors in the fourth decimal place.
Example 6. Suppose there are up to 7 defaults expected with probability $1-\gamma$ out of 1500 obligors, which are split into four risk classes: A, B, C and D, where A represents the lowest risk and D the highest. Assume the numbers of obligors are 400, 700, 250, 150 and the numbers of expected defaults are up to 2, 1, 3, 1 in risk classes A, B, C and D respectively. We apply Propositions 1, 2 and the method of conservatism introduced in [14] to estimate the probabilities of default $p_A$, $p_B$, $p_C$ and $p_D$ in risk classes A, B, C and D.
Using Proposition 1, the underlying logic stated in Subsection 2.1 and the method of conservatism, we obtain Table 4. As remarked in [14], "... this is not a desirable effect, a possible - conservative - work-around could be to increment the number of defaults in grade D up to the point where $p_D$ would take on a greater value than $p_C$ ...".

Concluding remarks
As stated, this survey article gives a detailed probabilistic overview of two methods for the upper bound of the default probability. The provided insights reveal the important role played by the beta-normal distribution. However, the beta-normal distribution appears to be little studied compared to the voluminous literature on the separate normal or beta distributions. It would be of interest to get a closed form of the inverse of $F_{\alpha,\beta,\varrho}(p)$ (see (16)) in terms of a superposition of $\Phi^{-1}_{\mu,\sigma^2}(\cdot)$ and $B^{-1}_{\alpha,\beta}(\cdot)$, which possibly would include studying the cumulative distribution function $\Phi_{\mu,\sigma^2}\left(a\,\Phi^{-1}_{\tilde\mu,\tilde\sigma^2}(x)+b\right)$ when $a,\,b\in\mathbb{R}$ and $x\in(0,1)$.

Acknowledgments
The author is thankful to Arvydas Karbonskis for his feedback on the draft version of this article and also to Dirk Tasche for pointing to the reference [16] and giving several other valuable comments.

Table 1: The upper bounds of $p_A$, $p_B$ and $p_C$.

Table 2: The quantiles of the distribution whose cumulative distribution function is $F_{n-k,\,k+1,\,\varrho}(y)$.

Table 3: The upper bounds of $p_A$, $p_B$ and $p_C$ under the influence of systematic factor. Here a .

Table 4: The upper bounds of $p_A$, $p_B$, $p_C$ and $p_D$.

Table 5: The quantiles of the distribution whose cumulative distribution function is $F_{n-k,\,k+1,\,\varrho}(y)$.