A Generalized Log-Weibull Distribution with Bio-Medical Applications

Abstract: Here we consider a generalized log-transformed version of the Weibull distribution and investigate some of its important properties, including expressions for the cumulative distribution function, hazard rate function, quantile function, characteristic function, raw moments, and incomplete moments. The distribution and moments of order statistics are obtained along with some results on certain structural properties of the distribution. The maximum likelihood estimation of the parameters of the distribution is attempted for both complete and censored data sets, and the usefulness of the distribution is illustrated with the help of real-life data sets from biomedical fields.


INTRODUCTION
The generalized versions of the Weibull distribution (WD) and their applications have gained much importance in fields such as engineering, ecology, medicine, and pharmacy, owing to the flexibility of these distributions in handling survival data. Many such modifications of the WD have been considered by various researchers, including [1][2][3][4][5]. An exponentiated version of the WD, named 'the exponentiated Weibull distribution', was introduced by [6] to incorporate the bathtub shape for its hazard rate function, through the cumulative distribution function (c.d.f.)
$$F(y) = \left[1 - e^{-(y/\sigma)^{\beta}}\right]^{\delta} \qquad (1)$$
for any y > 0, with scale parameter σ > 0 and shape parameters β > 0, δ > 0. A distribution with c.d.f. (1) is hereafter denoted as EWD(σ, β, δ). Various properties of the distribution were further investigated in detail by several researchers, such as [7][8][9]. More recently, [10] considered certain properties of a log-transformed version of the one-parameter Weibull distribution capable of dealing with truncated data sets, under the name 'the log-Weibull distribution (LWD)', with c.d.f.
$$F(y) = 1 - e^{-(\ln y)^{c}} \qquad (2)$$
for y > 1 and shape parameter c > 0. The distribution with c.d.f. (2) is hereafter referred to as the LWD(c).
In this paper, we consider a generalized version of the LWD(c) under the name 'the generalized log-Weibull distribution (GLWD)'. It is seen that the GLWD possesses five distinct shapes for its hazard rate function, covering most of the monotone as well as non-monotone hazard rate shapes (increasing, decreasing, bathtub, upside-down bathtub, and 'S' shapes), and is more flexible in terms of its measures of central tendency, dispersion, skewness, and kurtosis, which highlights the utility of the model as a lifetime distribution. As such, the proposed distribution can be considered a better alternative to many of the recently developed modifications of the WD. The paper is organized as follows. In Section 2 we introduce the GLWD and discuss its important properties. In Section 3, some structural properties of the distribution are dealt with, and Section 4 derives the distribution and moments of the order statistics of the GLWD. Section 5 contains the maximum likelihood estimation of the parameters of the distribution for complete and censored cases. In Section 6, the usefulness of the model is illustrated with the help of both complete and censored real-life data sets.

*Address correspondence to this author at the Department of Statistics, University of Kerala, Thiruvananthapuram, India; Tel: +919074108366; E-mail: drcsatheeshkumar@gmail.com

A random variable Y is said to follow the GLWD with shape parameters α > 0 and β > 0, denoted GLWD(α, β), if its c.d.f. is
$$F_Y(y) = \left[1 - e^{-(\ln y)^{\alpha}}\right]^{\beta} \qquad (3)$$
for y > 1, with corresponding probability density function (p.d.f.)
$$f_Y(y) = \frac{\alpha\beta}{y}(\ln y)^{\alpha-1} e^{-(\ln y)^{\alpha}}\left[1 - e^{-(\ln y)^{\alpha}}\right]^{\beta-1}. \qquad (4)$$
A practical interpretation of the GLWD(α, β) can be provided whenever β is an integer. Consider a device constituted of β independent and identically distributed components, each having the LWD(α) with c.d.f. (2) with c = α, and suppose that the device functions as long as at least one of its components functions. The c.d.f. of the lifetime of the device is then the β th power of the component c.d.f.
Clearly, in the light of (3), the lifetime of the device has the GLWD(α, β).
The plots of the p.d.f., the c.d.f., and the hazard rate function of the GLWD(α, β) for particular values of its parameters are presented in Figures 1, 2, and 3 respectively. Based on these figures, we have the following observations regarding the shapes of the p.d.f., c.d.f., and hazard rate function of the GLWD(α, β):

• From Figure 1 it can be observed that the p.d.f. of the GLWD(α, β) is a decreasing function for values of α and β such that αβ < 1, whereas the p.d.f. is unimodal when αβ > 1.
• Figure 2 reveals that, for a fixed value of the parameter β, the c.d.f. curves F_Y(.) of the GLWD(α, β) for various values of α all pass through the common point (e, 0.63212^β).
• From Figure 3, it is clear that the hazard rate function takes five different shapes (decreasing, increasing, bathtub, S-shape, and upside-down bathtub) depending on the values of the parameters α and β, which illustrates the flexibility of the distribution in modelling data sets.
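The shape functions discussed above can be explored numerically with a minimal Python sketch. It assumes the c.d.f. F(y) = [1 − exp{−(ln y)^α}]^β for y > 1, a form that agrees with the common point (e, 0.63212^β) noted for Figure 2 but is an assumption of this sketch; the function names are illustrative and not part of the paper.

```python
import math

def glwd_cdf(y, alpha, beta):
    """Assumed GLWD c.d.f.: F(y) = [1 - exp(-(ln y)^alpha)]^beta for y > 1."""
    if y <= 1.0:
        return 0.0
    x = math.log(y) ** alpha
    return (-math.expm1(-x)) ** beta  # -expm1(-x) equals 1 - exp(-x)

def glwd_pdf(y, alpha, beta):
    """Derivative of the assumed c.d.f. with respect to y."""
    if y <= 1.0:
        return 0.0
    u = math.log(y)
    x = u ** alpha
    return (alpha * beta * u ** (alpha - 1.0) / y
            * math.exp(-x) * (-math.expm1(-x)) ** (beta - 1.0))

def glwd_hazard(y, alpha, beta):
    """Hazard rate h(y) = f(y) / (1 - F(y))."""
    return glwd_pdf(y, alpha, beta) / (1.0 - glwd_cdf(y, alpha, beta))

# For a fixed beta, the c.d.f. evaluated at y = e should not depend on alpha.
beta = 2.0
values_at_e = [glwd_cdf(math.e, a, beta) for a in (0.5, 1.0, 3.0)]
print(values_at_e)
```

Evaluating glwd_hazard on a grid of y values for several (α, β) pairs is one way to reproduce the qualitative shapes described for Figure 3.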
On inverting the c.d.f. F_Y(y) of the GLWD(α, β), we obtain the expression for the quantile function y_p, for 0 < p < 1, as
$$y_p = \exp\left\{\left[-\ln\left(1 - p^{1/\beta}\right)\right]^{1/\alpha}\right\}. \qquad (7)$$
On substituting p = 0.5 in (7), we obtain the median (M) of the GLWD(α, β) as
$$M = \exp\left\{\left[-\ln\left(1 - 2^{-1/\beta}\right)\right]^{1/\alpha}\right\}.$$
Also, on differentiating the p.d.f. (4) with respect to y, we obtain the derivative f'_Y(y) as in (8). The mode of the GLWD(α, β) is obtained from (8) as the solution of the equation f'_Y(y) = 0, which reduces to (9). Using the condition for unimodality, the following observations can be made on the values of the modes of the GLWD(α, β) for various values of α and β.
• When α and β are such that αβ < 1, the p.d.f. of the GLWD(α, β) is a decreasing function of y.
• When α and β are such that αβ > 1 (for instance, when α = 1 and β > 1), the p.d.f. of the GLWD(α, β) is unimodal and the mode (M_o) is obtained from (9) as the solution of the mode equation.

Moreover, on differentiating the reversed hazard rate function (6) with respect to y, we obtain (10). Based on (10) we have the following result on the log-concavity of the c.d.f. of the GLWD(α, β), the proof of which is straightforward and hence omitted.
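The quantile, median, and mode can be sketched in Python as follows, again assuming the c.d.f. F(y) = [1 − exp{−(ln y)^α}]^β for y > 1. The mode equation used here is obtained by differentiating the logarithm of the corresponding p.d.f. and writing u = ln y; it is a derivation made for this sketch and need not match the printed form of (9). The bisection bounds and function names are illustrative choices.

```python
import math

def glwd_pdf(y, alpha, beta):
    """P.d.f. implied by the assumed c.d.f. [1 - exp(-(ln y)^alpha)]^beta."""
    u = math.log(y)
    x = u ** alpha
    return (alpha * beta * u ** (alpha - 1.0) / y
            * math.exp(-x) * (-math.expm1(-x)) ** (beta - 1.0))

def glwd_quantile(p, alpha, beta):
    """Inverse of the assumed c.d.f.: exp{[-ln(1 - p^(1/beta))]^(1/alpha)}."""
    return math.exp((-math.log1p(-p ** (1.0 / beta))) ** (1.0 / alpha))

def mode_equation(u, alpha, beta):
    """y * u * d/dy[ln f(y)] written in terms of u = ln y; zero at the mode."""
    x = u ** alpha
    ratio = x * math.exp(-x) / (-math.expm1(-x))  # x / (e^x - 1), overflow-safe
    return (alpha - 1.0) - u - alpha * x + alpha * (beta - 1.0) * ratio

def glwd_mode(alpha, beta, lo=1e-9, hi=20.0, iterations=200):
    """Bisection on u = ln y; returns None when alpha * beta <= 1."""
    if alpha * beta <= 1.0:
        return None
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mode_equation(mid, alpha, beta) > 0.0:
            lo = mid  # density still increasing at mid
        else:
            hi = mid  # density already decreasing at mid
    return math.exp(0.5 * (lo + hi))

median = glwd_quantile(0.5, 2.0, 3.0)
mode = glwd_mode(2.0, 3.0)
print(median, mode)
```

As u tends to 0 the left-hand side of the mode equation tends to αβ − 1, which is consistent with the condition αβ > 1 for unimodality.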
We have plotted the values of the median and mode of the GLWD(α, β) for arbitrary values of β and particular values of the parameter α (Figure 4). Now we present certain series representations and integrals which we require in the sequel for deriving expressions for the characteristic function and moments of the GLWD(α, β).
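Also, for any a ∈ ℝ and |z| < 1,
$$(1 - z)^{a} = \sum_{k=0}^{\infty} (-1)^{k} \binom{a}{k} z^{k}. \qquad (12)$$
The incomplete Gamma function γ(λ, z) is defined as
$$\gamma(\lambda, z) = \int_{0}^{z} t^{\lambda-1} e^{-t}\,dt. \qquad (13)$$
The expressions of the characteristic function and the r th raw moment of the GLWD(α, β) are obtained through the following results, the proofs of which are provided in Appendix A.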

Result 2.3
The r th raw moment of the GLWD(α, β) is given by (15), in which r ∈ N and N is the set of natural numbers.
The expressions for µ_r and φ_Y(t) are seen to converge for all values of r ∈ N and of the parameters of the GLWD(α, β). It has also been verified that the values of the raw moments of the GLWD can be calculated numerically using software packages such as MATHEMATICA and MATHCAD. We derive the expression for the r th incomplete raw moment of the GLWD(α, β) through the following result.

Result 2.4
The expression for the r th incomplete raw moment ∆_r(.) of the GLWD(α, β) can be obtained as (16), in which γ(λ, z) is the incomplete Gamma function as defined in (13).
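As a numerical cross-check on such moment expressions, the raw and incomplete moments can be approximated by direct quadrature. The sketch below assumes the c.d.f. F(y) = [1 − exp{−(ln y)^α}]^β, integrates in the variable u = ln y with a midpoint rule, and uses the illustrative parameter choice α = 2, β = 3 with a truncation point u_max = 12, for which the integrand has decayed to a negligible level. It does not use the series representation of the paper, and all names are hypothetical.

```python
import math

def glwd_cdf(y, alpha, beta):
    """Assumed c.d.f. [1 - exp(-(ln y)^alpha)]^beta for y > 1."""
    if y <= 1.0:
        return 0.0
    return (-math.expm1(-(math.log(y) ** alpha))) ** beta

def incomplete_moment(r, t, alpha, beta, steps=20000):
    """Midpoint-rule approximation of the integral of y^r f(y) over (1, t)."""
    if t <= 1.0:
        return 0.0
    h = math.log(t) / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h          # u = ln y at the cell midpoint
        x = u ** alpha
        total += (alpha * beta * u ** (alpha - 1.0)
                  * math.exp(r * u - x) * (-math.expm1(-x)) ** (beta - 1.0))
    return total * h

def raw_moment(r, alpha, beta, u_max=12.0):
    """Raw moment approximated by truncating the range at ln y = u_max."""
    return incomplete_moment(r, math.exp(u_max), alpha, beta)

alpha, beta = 2.0, 3.0
mean = raw_moment(1, alpha, beta)
second = raw_moment(2, alpha, beta)
print(mean, second - mean ** 2)   # approximate mean and variance
```

With r = 0 the incomplete moment reduces to the c.d.f., which gives a simple sanity check on the quadrature.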
The mean deviations about the mean and about the median of the GLWD(α, β) with c.d.f. F_Y(.) can be expressed in terms of its first incomplete moment ∆_1(.), obtained from (16) with r = 1. Now we present certain expressions for the percentile measures of skewness and kurtosis of the GLWD(α, β). Percentile measures of skewness and kurtosis of a distribution are less affected by the tail behaviour of the distribution or by outliers (see [12]) and find use in cases where the moment measures are infinite. Galton's and Bowley's measures of skewness, S_G and S_B, are defined as
$$S_G = \frac{y_{0.8} - y_{0.5}}{y_{0.5} - y_{0.2}} \qquad (19)$$
and
$$S_B = \frac{y_{0.75} + y_{0.25} - 2y_{0.5}}{y_{0.75} - y_{0.25}}, \qquad (20)$$
while the expression for the Schmid-Trede measure of kurtosis L is
$$L = \frac{y_{0.975} - y_{0.025}}{y_{0.75} - y_{0.25}}, \qquad (21)$$
in which y_p is the p th quantile of the GLWD(α, β) as given in (7). The percentile measures of skewness and kurtosis of the GLWD(α, β) are obtained through the following results, the proofs of which follow directly from (19), (20), and (21) respectively, in the light of (7). We have also calculated the values of S_G, S_B, and L of the GLWD(α, β) for particular values of its parameters and plotted them in Figure 5.
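The percentile measures can be evaluated directly from a quantile function. The Python sketch below assumes the quantile y_p = exp{[−ln(1 − p^{1/β})]^{1/α}} and the commonly used quantile-based definitions of Galton's skewness (0.2, 0.5, 0.8 quantiles), Bowley's skewness (quartiles), and the Schmid-Trede kurtosis (0.025, 0.25, 0.75, 0.975 quantiles); both the quantile form and these definitions are assumptions of the sketch, and the function names are illustrative.

```python
import math

def glwd_quantile(p, alpha, beta):
    """Assumed GLWD quantile: exp{[-ln(1 - p^(1/beta))]^(1/alpha)}."""
    return math.exp((-math.log1p(-p ** (1.0 / beta))) ** (1.0 / alpha))

def galton_skewness(alpha, beta):
    """(y_0.8 - y_0.5) / (y_0.5 - y_0.2); equals 1 for a symmetric law."""
    q = lambda p: glwd_quantile(p, alpha, beta)
    return (q(0.8) - q(0.5)) / (q(0.5) - q(0.2))

def bowley_skewness(alpha, beta):
    """(y_0.75 + y_0.25 - 2 y_0.5) / (y_0.75 - y_0.25); lies in (-1, 1)."""
    q = lambda p: glwd_quantile(p, alpha, beta)
    return (q(0.75) + q(0.25) - 2.0 * q(0.5)) / (q(0.75) - q(0.25))

def schmid_trede_kurtosis(alpha, beta):
    """(y_0.975 - y_0.025) / (y_0.75 - y_0.25)."""
    q = lambda p: glwd_quantile(p, alpha, beta)
    return (q(0.975) - q(0.025)) / (q(0.75) - q(0.25))

for a, b in [(1.0, 1.0), (2.0, 3.0), (4.0, 0.5)]:
    print(a, b, galton_skewness(a, b), bowley_skewness(a, b),
          schmid_trede_kurtosis(a, b))
```

For α = β = 1 the assumed quantile reduces to 1/(1 − p), which gives S_G = 4 and S_B = 0.5 in closed form and provides a quick check of the implementation.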

Result 2.5
Galton's and Bowley's percentile measures of skewness, denoted by S_G and S_B respectively, of the GLWD(α, β) are obtained on substituting the quantiles y_p from (7) in (19) and (20).

Result 2.6
For the GLWD(α, β), the Schmid-Trede percentile measure of kurtosis (L) is obtained on substituting the quantiles y_p from (7) in (21).

In the light of Results 2.5 and 2.6, we have the following observations.

• The GLWD is symmetric when

The following result gives an expression for the geometric mean (GM) of the GLWD(α, β), the proof of which follows directly from the definition of the GM as ln(GM) = E[ln(Y)].

Result 2.7
The GM of the GLWD(α, β) is GM = exp{E[ln(Y)]}, in which the expectation is taken with respect to the p.d.f. (4).

The stress-strength reliability concept, initially considered by [13], is used for describing the life of a component having strength Y_2, subjected to a stress Y_1, where both Y_1 and Y_2 are random variables. The component fails if Y_1 > Y_2 and survives otherwise. The stress-strength reliability measure R, defined as
$$R = P(Y_1 < Y_2),$$
is the probability that a randomly selected device functions successfully and is a measure of component reliability, having significant applications in areas of engineering, genetics, psychology, physics, and economics. We obtain expressions for the stress-strength reliability measure R of the GLWD(α, β) for fixed values of its parameters through the following results.

Result 2.8
Let Y i be a random variable following the GLWD(α i , β i ), for i=1,2; with p.d.f. f Y (.) as defined in (4). Then the stress-strength reliability measure R is the following, in which β(p, q) is the Beta function.
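Proof. By definition,
$$R = P(Y_1 < Y_2) = \int_{1}^{\infty} F_{Y_1}(y)\, f_{Y_2}(y)\, dy,$$
from which the result follows on substituting the c.d.f. (3) and the p.d.f. (4).

From Result 2.8 and Result 2.9, it can be observed that the system reliability between two variables following the GLWD depends only on the values of those parameters that vary between the two variables.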
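The observation above can be checked numerically. The Python sketch below assumes the c.d.f. F(y) = [1 − exp{−(ln y)^α}]^β and evaluates R = P(Y_1 < Y_2) as the integral of F_1(Q_2(p)) over p in (0, 1), where Q_2 is the quantile function of the strength variable. Under this assumed form, when the two variables share the same α the integral reduces to β_2/(β_1 + β_2), which does not involve α; this closed form is derived here from the assumed c.d.f. and is not quoted from Results 2.8 or 2.9. Function names are illustrative.

```python
import math

def glwd_cdf(y, alpha, beta):
    """Assumed c.d.f. [1 - exp(-(ln y)^alpha)]^beta for y > 1."""
    if y <= 1.0:
        return 0.0
    return (-math.expm1(-(math.log(y) ** alpha))) ** beta

def glwd_quantile(p, alpha, beta):
    """Inverse of the assumed c.d.f."""
    return math.exp((-math.log1p(-p ** (1.0 / beta))) ** (1.0 / alpha))

def stress_strength(alpha1, beta1, alpha2, beta2, steps=20000):
    """Midpoint rule for R = P(Y1 < Y2) = integral of F1(Q2(p)) dp on (0, 1)."""
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        p = (k + 0.5) * h
        total += glwd_cdf(glwd_quantile(p, alpha2, beta2), alpha1, beta1)
    return total * h

# Stress Y1 ~ GLWD(2, 2) and strength Y2 ~ GLWD(2, 3): common alpha.
r_common_alpha = stress_strength(2.0, 2.0, 2.0, 3.0)
print(r_common_alpha, 3.0 / (2.0 + 3.0))
```

Swapping the roles of the two variables should give 1 − R, which offers a second consistency check when the shape parameters α differ.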
We derive an expression for the mean residual life (MRL) function of the GLWD(α, β) through the following result.

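Proof. By definition, the MRL function at y is the expected remaining lifetime E(Y − y | Y > y), which leads to the expression (26) involving the integrals I_1 and I_2.
Now, on integrating by parts, one of these integrals can be written as
$$\int_{1}^{y} t\, f_Y(t)\, dt = yF_Y(y) - \int_{1}^{y} F_Y(t)\, dt = yF_Y(y) - \upsilon(y),$$
in which υ(y) can be evaluated using the binomial expansion or (12) depending on whether β is an integer or a real number, respectively. Evaluating (26) using the expressions for I_1 and I_2 gives (25).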

SOME STRUCTURAL PROPERTIES
In this section, we present some structural properties of the GLWD(α, β) through the following results. The proofs of Results 3.4 to 3.6 are omitted as they are straightforward and can be obtained directly by the method of transformation of variables.
Using y = 1 + t for extremely small t > 0 in (27), we get (28). On expanding the term ln(1 + t) in (28) and discarding the second term onwards, we obtain the stated result.

DISTRIBUTION AND MOMENTS OF ORDER STATISTICS
Let Y_{i:n} be the i th order statistic based on a random sample Y_1, Y_2, ..., Y_n of size n from the GLWD(α, β), with p.d.f. f_Y(y) = f_Y(y; β) as given in (4), and let µ_r = µ_r(β) be the r th raw moment of the GLWD(α, β) as given in (15). The distribution and moments of Y_{i:n} can now be derived through the following results.
in which $\beta_{ik}^{*} = \beta(i + k - 1)$ and $\nu_{n:i:k}$ is as defined in (29).

Proof. Consider a random sample of size n from a GLWD(α, β) with p.d.f. f_Y(y) and c.d.f. F_Y(y). Then the p.d.f. of the i th order statistic Y_{i:n} is
$$f_{i:n}(y) = \frac{n!}{(i-1)!\,(n-i)!}\,[F_Y(y)]^{i-1}\,[1 - F_Y(y)]^{n-i}\, f_Y(y). \qquad (30)$$
By applying the binomial theorem in (30), we have (31). On further simplification, (31) reduces to (29).
As a consequence of Result 4.1, we have the following corollaries.

Corollary 4.2
For y > 1, the largest order statistic Y_{n:n} = max(Y_1, Y_2, ..., Y_n) has the GLWD(α, nβ), with c.d.f. [F_Y(y)]^n.

Further, the r th raw moment of the i th order statistic Y_{i:n} of the GLWD(α, β) is provided through the following result, the proof of which follows from Results 2.3 and 4.1.

Result 4.2
The r th raw moment of Y i:n is in which r > 0 and ν n:i:k and β ik * are as defined in (29).
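The order-statistic density and the statement of Corollary 4.2 can be checked with a short Python sketch. It assumes the c.d.f. F(y) = [1 − exp{−(ln y)^α}]^β and uses the standard density of the i th order statistic directly, rather than the series form (29); the function names and the parameter values are illustrative.

```python
import math

def glwd_cdf(y, alpha, beta):
    """Assumed c.d.f. [1 - exp(-(ln y)^alpha)]^beta for y > 1."""
    return (-math.expm1(-(math.log(y) ** alpha))) ** beta

def glwd_pdf(y, alpha, beta):
    """P.d.f. implied by the assumed c.d.f."""
    u = math.log(y)
    x = u ** alpha
    return (alpha * beta * u ** (alpha - 1.0) / y
            * math.exp(-x) * (-math.expm1(-x)) ** (beta - 1.0))

def order_stat_pdf(y, i, n, alpha, beta):
    """Density of the i-th order statistic in a sample of size n."""
    big_f = glwd_cdf(y, alpha, beta)
    coef = math.factorial(n) / (math.factorial(i - 1) * math.factorial(n - i))
    return (coef * big_f ** (i - 1) * (1.0 - big_f) ** (n - i)
            * glwd_pdf(y, alpha, beta))

# The maximum of n = 4 observations from GLWD(2, 1.5) versus GLWD(2, 6).
alpha, beta, n = 2.0, 1.5, 4
for y in (1.5, 2.5, 4.0):
    print(y, order_stat_pdf(y, n, n, alpha, beta), glwd_pdf(y, alpha, n * beta))
```

The two printed columns should agree to rounding error, since n[F_Y(y)]^{n−1} f_Y(y) is the derivative of [F_Y(y)]^n.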

ESTIMATION
In this section, we discuss the maximum likelihood estimation of the parameters of the GLWD(α, β) and derive the likelihood equations for the complete and right-censored cases. A data set of observations without any missing value is termed an uncensored or complete data set. The likelihood function for a complete data set X_1, X_2, ..., X_n with p.d.f. f(x) is given by
$$L = \prod_{i=1}^{n} f(x_i).$$
Censored data are regularly encountered in survival and reliability analysis, as the information regarding the survival time of some of the observations under study may remain incomplete or unknown. According to [15], censored data sets represent a particular type of missing data. Assume that we have a random sample of n units with true survival times T_1, T_2, ..., T_n having p.d.f. f(x) and c.d.f. F(x). However, due to right censoring arising from staggered entry, loss to follow-up, competing risks (death from other causes), or any combination of these, it might be impossible to observe the survival times in all of these n cases. Thus, a subject can either be observed for its full lifetime or can be censored. Clearly, the observed data are the minimum of the survival time and the censoring time for each unit. Assume that C_1, C_2, ..., C_n are the censoring times of the n units, drawn independently of T_i, i = 1, 2, ..., n. On each of the n units, we observe the random pair (X_i, η_i), in which X_i = min(T_i, C_i) and
$$\eta_i = \begin{cases} 1, & T_i \le C_i \\ 0, & T_i > C_i \end{cases}$$
for i = 1, 2, ..., n. Clearly η_i, the censorship indicator, indicates whether T_i is censored or not. Then, the likelihood function for the censored data set is given by
$$L = \prod_{i=1}^{n} [f(x_i)]^{\eta_i}\,[1 - F(x_i)]^{1-\eta_i}. \qquad (33)$$

Estimation of Parameters for the GLWD for Complete Data Sets
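Consider a random sample consisting of n observations Y_1, Y_2, ..., Y_n taken from the GLWD(α, β). The log-likelihood function for the vector of parameters Θ = (α, β) is given by
$$\ell(\Theta) = n\ln(\alpha\beta) + (\alpha - 1)\sum_{i=1}^{n}\ln(\ln y_i) - \sum_{i=1}^{n}\ln y_i - \sum_{i=1}^{n}(\ln y_i)^{\alpha} + (\beta - 1)\sum_{i=1}^{n}\ln\left[1 - e^{-(\ln y_i)^{\alpha}}\right]. \qquad (34)$$
Differentiating the log-likelihood function (34) with respect to the parameters α and β respectively, and equating to zero, we obtain the following likelihood equations:
$$\frac{n}{\alpha} + \sum_{i=1}^{n}\ln(\ln y_i) - \sum_{i=1}^{n}(\ln y_i)^{\alpha}\ln(\ln y_i) + (\beta - 1)\sum_{i=1}^{n}\frac{(\ln y_i)^{\alpha}\ln(\ln y_i)\,e^{-(\ln y_i)^{\alpha}}}{1 - e^{-(\ln y_i)^{\alpha}}} = 0,$$
$$\frac{n}{\beta} + \sum_{i=1}^{n}\ln\left[1 - e^{-(\ln y_i)^{\alpha}}\right] = 0.$$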
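A minimal Python sketch of complete-data estimation is given below. It assumes the p.d.f. implied by the c.d.f. F(y) = [1 − exp{−(ln y)^α}]^β and uses the fact that, for a fixed α, the likelihood equation in β has the closed-form solution β = −n / Σ ln[1 − exp{−(ln y_i)^α}], so that only a one-dimensional search over α is needed. The grid search, the simulated sample, and all names are illustrative choices and not the estimation procedure used in the paper.

```python
import math
import random

def glwd_quantile(p, alpha, beta):
    """Inverse of the assumed c.d.f., used here to simulate data."""
    return math.exp((-math.log1p(-p ** (1.0 / beta))) ** (1.0 / alpha))

def log_g(y, alpha):
    """ln[1 - exp(-(ln y)^alpha)] for a single observation y > 1."""
    return math.log(-math.expm1(-(math.log(y) ** alpha)))

def log_likelihood(data, alpha, beta):
    """Complete-data log-likelihood under the assumed p.d.f."""
    total = 0.0
    for y in data:
        u = math.log(y)
        total += (math.log(alpha * beta) + (alpha - 1.0) * math.log(u)
                  - u - u ** alpha + (beta - 1.0) * log_g(y, alpha))
    return total

def profile_beta(data, alpha):
    """Closed-form maximiser of the log-likelihood in beta for fixed alpha."""
    return -len(data) / sum(log_g(y, alpha) for y in data)

def fit_glwd(data, grid=None):
    """Grid search over alpha on the profile log-likelihood (illustrative)."""
    grid = grid or [k / 50.0 for k in range(3, 501)]  # 0.06, 0.08, ..., 10.0
    alpha_hat = max(
        grid, key=lambda a: log_likelihood(data, a, profile_beta(data, a)))
    return alpha_hat, profile_beta(data, alpha_hat)

random.seed(1)
# Hypothetical sample simulated from GLWD(2, 3) by inverse transform.
sample = [glwd_quantile(max(random.random(), 1e-12), 2.0, 3.0)
          for _ in range(300)]
alpha_hat, beta_hat = fit_glwd(sample)
print(alpha_hat, beta_hat)
```

A grid search is used only to keep the sketch dependency-free; in practice a numerical optimiser applied to the same profile log-likelihood would be a natural refinement.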

Estimation of Parameters for the GLWD(α, β) for Censored Data Sets
Let r be the number of failures among the n units. Then, using (4) and (3) in (33), the likelihood function of the GLWD(α, β) for a censored data set is given by (36). From (36), we obtain the corresponding log-likelihood function as (37). Differentiating (37) with respect to the parameters α and β, we have the following likelihood equations for censored observations.
These likelihood equations may not always provide a unique solution, and in such cases the maximum of the likelihood function is attained on the boundary of the parameter space. Hence, we have obtained the second-order partial derivatives of the log-likelihood function of the GLWD, and by using R software it has been verified that the values of the second-order partial derivatives are negative at the estimated parameter values in both cases.
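The right-censored log-likelihood can be sketched in Python as follows, assuming the c.d.f. F(y) = [1 − exp{−(ln y)^α}]^β and the usual convention that an observed failure (η_i = 1) contributes ln f(x_i) while a censored unit (η_i = 0) contributes ln[1 − F(x_i)]. The data values below are hypothetical and serve only to exercise the function; the names are illustrative.

```python
import math

def glwd_log_pdf(y, alpha, beta):
    """Log of the p.d.f. implied by the assumed c.d.f."""
    u = math.log(y)
    x = u ** alpha
    return (math.log(alpha * beta) + (alpha - 1.0) * math.log(u)
            - u - x + (beta - 1.0) * math.log(-math.expm1(-x)))

def glwd_log_survival(y, alpha, beta):
    """ln[1 - F(y)] with F(y) = [1 - exp(-(ln y)^alpha)]^beta."""
    x = math.log(y) ** alpha
    return math.log1p(-((-math.expm1(-x)) ** beta))

def censored_log_likelihood(times, events, alpha, beta):
    """Sum of ln f over failures plus ln(1 - F) over censored units."""
    total = 0.0
    for x_i, eta_i in zip(times, events):
        if eta_i == 1:
            total += glwd_log_pdf(x_i, alpha, beta)
        else:
            total += glwd_log_survival(x_i, alpha, beta)
    return total

# Hypothetical observed times (all > 1) and censorship indicators.
times = [1.8, 2.4, 3.1, 4.0, 5.5]
events = [1, 1, 0, 1, 0]
print(censored_log_likelihood(times, events, 2.0, 3.0))
```

Maximising this function over (α, β), for instance by a grid or a numerical optimiser, gives the censored-data estimates; evaluating its second-order differences at the maximiser is one way to check that a local maximum has been reached.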

APPLICATIONS
To illustrate the utility of the GLWD(α, β) as a survival model, we make use of the following three data sets of which the first and second are complete data sets while the third one is a censored data set. All these three data sets arise from biomedical fields.
From Tables 4, 5, and 6, it can be observed that the GLWD(α, β) gives a relatively better fit to both complete and censored data sets as compared to the other distributions, since its values of AIC, BIC, AICc, and CAIC are the smallest. Also, from Table 7, it is evident that the parameter β is significant for all three data sets.
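The information criteria used for model comparison can be computed from the maximised log-likelihood as in the Python sketch below. It assumes the commonly used definitions AIC = 2k − 2ℓ, BIC = k ln n − 2ℓ, AICc = AIC + 2k(k + 1)/(n − k − 1), and CAIC = k(ln n + 1) − 2ℓ, where k is the number of parameters and n the sample size; the CAIC form in particular varies between authors, so it should be checked against the definition used for the tables. The numerical inputs are hypothetical.

```python
import math

def information_criteria(log_lik, k, n):
    """Return AIC, BIC, AICc and CAIC under the definitions stated above."""
    aic = 2.0 * k - 2.0 * log_lik
    bic = k * math.log(n) - 2.0 * log_lik
    aicc = aic + 2.0 * k * (k + 1.0) / (n - k - 1.0)
    caic = k * (math.log(n) + 1.0) - 2.0 * log_lik
    return {"AIC": aic, "BIC": bic, "AICc": aicc, "CAIC": caic}

# Hypothetical fit: maximised log-likelihood -100 with k = 2 and n = 50.
criteria = information_criteria(-100.0, 2, 50)
for name, value in criteria.items():
    print(name, round(value, 4))
```

Smaller values indicate a better trade-off between fit and complexity, which is the sense in which the criteria are compared across candidate distributions.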

CONCLUSION
In this paper we have considered a generalization of the log-Weibull distribution studied by [10], under the name 'the generalized log-Weibull distribution (GLWD)'. Some important theoretical properties of the distribution were investigated, including expressions for its characteristic function, moments, and certain reliability measures, as well as the distribution and moments of order statistics. The maximum likelihood estimation of its parameters for complete as well as censored data sets was considered, and the utility of the model in survival analysis was illustrated using three real-life data sets. Based on the present study, it can be concluded that the proposed model has much more flexibility compared to many existing models and is relatively more suitable for handling censored and complete survival data sets, especially those arising from biomedical applications.