Article information

Title: GROUP LENDING WITH HETEROGENEOUS TYPES.
Authors: Gan, Li; Hernandez, Manuel A.; Liu, Yanyan, et al.
Journal: Economic Inquiry
Print ISSN: 0095-2583
Publication year: 2018
Issue: April
Publisher: Western Economic Association International

I. INTRODUCTION

Group lending is a common practice in many microfinance programs in developing countries. Given that the poor often lack appropriate financial collateral, group lending programs are intended to provide a feasible way of extending credit to poor people who are usually kept out of traditional banking systems. Group lending allows lending institutions to rely on information advantages among group members, rather than on financial collateral, to mitigate information asymmetries between lenders and borrowers. There is a debate, however, regarding whether these programs are able to achieve and maintain sound repayment performance while simultaneously serving poor borrowers (Armendariz de Aghion and Morduch 2005). It is also frequently argued that the high transaction costs faced by microfinance institutions in screening their clients, processing applications, and collecting repayments keep interest rates high and prevent the programs from further expanding their operations (Armendariz de Aghion and Morduch 2004; Field and Pande 2008; Shankar 2006).

In this context, a large empirical literature explores the different factors (including group characteristics) that determine repayment performance in group lending programs (e.g., Ahlin and Townsend 2007; Cull, Demirguc-Kunt, and Morduch 2007; Hermes, Lensink, and Mehrteab 2005; Paxton, Graham, and Thraen 2000; Sharma and Zeller 1997; Wydick 1999; Zeller 1998). However, studies using observational data are often subject to an endogeneity problem (Hermes and Lensink 2007; Karlan 2007). The groups are typically formed voluntarily based on a set of common characteristics such as risk type, entrepreneurial spirit, solidarity, and trust among group members; these characteristics are generally observed by peers but not by lenders (or econometricians). The unobserved group heterogeneity resulting from this peer screening (or peer selection) process affects repayment performance and potentially correlates with the observed member demographics and proxies for social ties generally used in single-agent models to account for group heterogeneity. (1)

Similarly, groups may also differ in the effectiveness of peer monitoring (and enforcement) among members; this is also unobserved by lenders and can have direct implications on the repayment performance of group members. (2) The effectiveness of peer monitoring across groups may be further correlated with peer screening because individuals who team up with safe borrowers may exert different effort levels than those who team up with risky borrowers. Ultimately, decisions made within a group, including potential coordinated behavior, also depend on the level of group cohesion. (3)

To control for the unobserved group heterogeneity, some recent studies resort to a (quasi-) randomization of the group formation process in particular settings. For example, Karlan (2007) exploited a unique quasi-random group formation process in a rural town in Peru to identify social connections and found evidence of successful peer monitoring and enforcement of joint liability loans, particularly among individuals with stronger social ties. Gine et al. (2010) also examined the impact of a variety of group lending schemes on default and investment decisions using a controlled laboratory environment in an urban market in Peru. However, in the majority of group lending programs, only observational data are available. It is thus desirable to develop methods that can better control for the endogeneity issue in observational studies.

This paper intends to fill this gap by proposing and implementing a finite mixture structure to model group members' repayment behavior in the presence of unobserved group heterogeneity. In the proposed mixture structure, the group type summarizes all unobserved individual and group characteristics within the group. Individuals make repayment decisions based on their unobserved group type as well as on observable individual and loan characteristics. Average member characteristics and other group and village characteristics are used, in turn, to identify the group types. The proposed method can help to better account for the endogeneity problem inherent in observational studies, relative to using traditional probabilistic models, although the endogeneity bias is not necessarily fully eliminated. Similarly, the model is more informative because the effects of the factors explaining repayment behavior are allowed to differ by group type and because the model can help to better screen between likely defaulters and nondefaulters. This is critical in the context of microlending because better identifying potential group types and offering them differentiated contracts can help reduce information asymmetries and increase microlending. We discuss the model identification and provide evidence supporting the robustness of the model.

Other studies that use mixture specifications to uncover heterogeneous behaviors include the work of Keane and Wolpin (1997) to model heterogeneous endowment ability in career decisions; the work of Knittel and Stango (2003) to assess whether state-mandated price ceilings serve as focal points for tacit collusion among credit card companies; the work of Gan and Hernandez (2013) to examine whether agglomerated hotels have a higher probability of following collusive regimes; and the work of Dong, Gan, and Wang (2015) to evaluate varying neighborhood effects on educational attainment. Most of these studies, however, do not test the identification of the mixture model proposed.

Our model is applied using a rich dataset of 1,110 group loans, which were allocated to a total of 12,833 members from a group lending program in Andhra Pradesh, India. The results provide strong evidence supporting the existence of two group types. We identify a first group whose members are more inclined to fulfill their credit obligations (i.e., a "responsible" group) and a second group whose members are more inclined to default (i.e., an "irresponsible" group). We also find important differences in the marginal effects of the different individual and loan characteristics included in the repayment equation, suggesting that the underlying factors driving default behavior are likely to differ across types. For example, the existence of more stringent repayment schedules and shorter loan terms and the encouragement of a larger group size seem to be more relevant factors among "irresponsible" groups. Finally, the type-varying model shows a higher predictive performance than standard probabilistic models, particularly in the identification of potential "defaulters."

Overall, this paper makes three contributions to the literature of group lending and the literature of mixture models. First, we provide a new approach to better control for the potential endogeneity issue in observational studies that look at performance of group lending programs, relative to standard probabilistic models. Second, we test the identification of the mixture model proposed, which has generally been overlooked in the related empirical literature using mixture structures. Third, we highlight the usefulness of applying a mixture model to screening "better" versus "worse" groups, which helps mitigate information asymmetries faced by lenders. In this regard, our study highlights the suitability of implementing a mixture model in other related settings such as other group-based programs, personal loans, insurance markets, and filing decisions.

The remainder of the paper is organized as follows. Section II presents and discusses in more detail the proposed model to evaluate repayment decisions with unobserved group heterogeneity. Section III describes the group lending data used in the application analysis and reports and discusses the estimation results. Section IV concludes.

II. MODEL

Let the default behavior of individual i in group j be given by

(1) $D_{ij} = \mathbf{1}(\alpha + X_{ij}\beta_1 + C_j\beta_2 + T^*_j + u_{ij} > 0)$

where $D_{ij}$ is the observed binary outcome (i.e., $D_{ij}$ equals one if the individual defaults [i.e., does not fully repay her loan] and equals zero otherwise), $\alpha$ is a constant, $X_{ij}$ is a vector of observable individual characteristics, $C_j$ is a vector of loan characteristics, $T^*_j$ is the unobserved group type, which is likely correlated with $X_{ij}$ and $C_j$, and $u_{ij}$ is an error term. The correlation between $X_{ij}$ and $T^*_j$ may result, for example, from a proxy for an individual's social ties included in $X_{ij}$ and potentially correlated with the social ties of her peers (who generally live in the same neighborhood), which partly describe $T^*_j$; the loan terms $C_j$ may also be correlated with the group features that describe $T^*_j$.

If group heterogeneity is solely based on observables, the observed group characteristics $W^o_j$ such as average member characteristics and other group controls, including social ties, would be sufficient to identify the group types, and $W^o_j$ could be used as a proxy for $T^*_j$ to estimate Equation (1) using a standard probabilistic regression (e.g., probit or logit). However, the unobserved group type is more accurately characterized by both observable and unobservable factors such that $T^*_j = W^o_j\delta + W^u_j + \epsilon_j$, where $W^u_j$ is unobserved, $W^o_j$ and $W^u_j$ are potentially correlated, and $\epsilon_j$ is an error term. Following the previous example, a proxy for social ties or connections of a group, included in $W^o_j$, is likely correlated with the unobserved entrepreneurial spirit or economic opportunities of group members, which are contained in $W^u_j$ and further affect repayment.

Hence, a standard probabilistic regression of Equation (1) with only $W^o_j$ on the right-hand side results in an omitted variable bias, as $W^u_j$ is embedded in the error term. Another option is to incorporate the unobserved group component or type as fixed effects in a conditional logit model. However, a fixed-effect logistic regression mainly exploits within-group variation and will drop all groups without intragroup differences in default behavior. (4) Furthermore, the observed factors affecting repayment performance may vary by group type.

We alternatively propose a finite mixture structure in which the unobserved group heterogeneity can be captured by allowing groups to be of a certain type. We can assume the existence of $N$ different group types, but in practice we select the number of types that best fits our data based on different selection criteria. In particular, we considered models with two and three types (i.e., $N = 2, 3$) and found that a two-type specification is preferable to a three-type specification. (5) The two-type model exhibits a lower Schwarz Bayesian Information Criterion (SBIC), while a likelihood ratio test shows that the three-type model does not provide a better fit than the two-type model. (6)
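The two selection criteria mentioned above can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names, the parameter counts, and the log-likelihood values are hypothetical, and the SBIC is taken in its usual form, -2 ln L + k ln n.

```python
import math

def sbic(loglik, n_params, n_obs):
    """Schwarz Bayesian Information Criterion; lower values are preferred."""
    return -2.0 * loglik + n_params * math.log(n_obs)

def lr_statistic(loglik_restricted, loglik_unrestricted):
    """Likelihood ratio statistic comparing nested fits (two vs. three types)."""
    return 2.0 * (loglik_unrestricted - loglik_restricted)

# Hypothetical fitted values, for illustration only (not taken from the paper).
n_obs = 12833
two_type = {"loglik": -5000.0, "n_params": 40}
three_type = {"loglik": -4990.0, "n_params": 65}

sbic_two = sbic(two_type["loglik"], two_type["n_params"], n_obs)
sbic_three = sbic(three_type["loglik"], three_type["n_params"], n_obs)
lr = lr_statistic(two_type["loglik"], three_type["loglik"])
prefer_two = sbic_two < sbic_three
```

In this made-up case the extra parameters of the three-type model are not justified by the small gain in log likelihood. The likelihood ratio statistic for the number of mixture components does not generally follow a standard chi-square distribution, so its critical values need care.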

We then assume that $T^*_j$ can take two possible values: $T^*_j = T^H_j$ or $T^*_j = T^L_j$. We can think of $T^H_j$ as type-H or "responsible" groups and $T^L_j$ as type-L or "irresponsible" groups. The repayment behavior of individual i in group j is given by

(2) $D_{ij} = \begin{cases} \mathbf{1}(\alpha_H + X_{ij}\beta_{1,H} + C_j\beta_{2,H} + u_{ij,H} > 0) & \text{if } T^*_j = T^H_j \\ \mathbf{1}(\alpha_L + X_{ij}\beta_{1,L} + C_j\beta_{2,L} + u_{ij,L} > 0) & \text{if } T^*_j = T^L_j \end{cases}$

By assuming $T^*_j$ to be categorical (in this case, taking two possible values), the effect of group heterogeneity is absorbed by the constant terms $\alpha_H$ and $\alpha_L$, while the covariance between $X_{ij}$, $C_j$, and $u_{ij}$ is zero. A direct implication of this specification is that the constant terms are different for different types; specifically, $\alpha_H < \alpha_L$ as the type-H group is regarded as the "responsible" group. The coefficients of the control variables are also allowed to differ across types, which permits us to capture varying effects of different factors on the repayment behavior by type. (7)

Since the type is unobserved, it can only be determined with a probability. We can further assume that the probability of being of a certain group type varies with some observable characteristics $W^o_j$. That is, we can correlate the apparent group types with specific observable characteristics, which can be useful for screening purposes among credit institutions. For example, if $W^o_j = (\bar{X}_j, G_j)$, the probability of being in a type-H group can be modeled as

(3) $\Pr(T^*_j = T^H_j) = \Pr(\bar{X}_j\delta_1 + G_j\delta_2 + v_j > 0)$

where $\bar{X}_j$ is a vector of average (leave-me-out) characteristics of group members, $G_j$ is a vector of group and village controls, and $v_j$ is an error term. The probability of being in a type-L group is, in turn, given by $\Pr(T^*_j = T^L_j) = 1 - \Pr(T^*_j = T^H_j)$.
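The leave-me-out averages can be computed by subtracting each member's own value from the group total. The sketch below is one way to do this with NumPy; the function name and the toy data are hypothetical, and groups with a single member would need separate handling because the denominator is zero.

```python
import numpy as np

def leave_one_out_means(x, group_ids):
    """Mean of x over the other members of each individual's group."""
    x = np.asarray(x, dtype=float)
    group_ids = np.asarray(group_ids)
    # inverse maps each individual to the index of her group; counts are group sizes.
    _, inverse, counts = np.unique(group_ids, return_inverse=True, return_counts=True)
    group_sums = np.bincount(inverse, weights=x)
    return (group_sums[inverse] - x) / (counts[inverse] - 1)

# Toy example: a literacy dummy for five women in two groups.
literate = [1, 0, 1, 1, 0]
groups = ["a", "a", "a", "b", "b"]
loo = leave_one_out_means(literate, groups)
```

For the first woman in group "a", the average is taken over the other two members only, so her own literacy status does not enter her group-level regressor.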

Overall, in the proposed specification, the probability of default is conditional on the unobserved group type ($T^*_j$) and depends on observable individual and loan characteristics ($X_{ij}$ and $C_j$), while average member characteristics and other group and village characteristics ($\bar{X}_j$ and $G_j$, observed by lenders prior to giving a loan) can help to identify the group type to which individuals belong. Member characteristics may include, for instance, education, asset ownership, housing condition, and occupation, which is information generally disclosed during credit application processes. Standard loan characteristics include loan amount, interest rate, length of loan, and repayment frequency. The other group and village controls used to identify the group type may include group age, number of members, location, and access to programs and services.

Some factors are then directly included in the repayment Equation (2), while other factors indirectly affect the likelihood of repayment through the modeled group type Equation (3). We can still quantify the (indirect) effect of the variables included in the type equation on the probability of repaying. More specifically, we can recover unconditional marginal effects of the variables included in the type equation on the likelihood of repaying, as shown below. Certainly, there can be some discussion regarding which variables should be included in the modeled Equations (2) and (3), which is similar to the discussion when estimating a selection model. In our application, the specification above provides the best fit for the data. (8)

The resulting unconditional probability of default is equal to

(4a) $\Pr(D_{ij} = 1) = \sum_{K=H,L} \Pr(D_{ij} = 1 \mid T^*_j = T^K_j) \times \Pr(T^*_j = T^K_j)$.

Similarly,

(4b) $\Pr(D_{ij} = 0) = \sum_{K=H,L} \Pr(D_{ij} = 0 \mid T^*_j = T^K_j) \times \Pr(T^*_j = T^K_j)$.
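To see how a variable that enters only the type equation still affects default, one can differentiate the unconditional default probability, which is the type-probability-weighted sum of the two conditional default probabilities, with respect to a continuous element $g$ of $G_j$ with coefficient $\delta_g$. The sketch below assumes that $v_j$ has a symmetric distribution with CDF $J(\cdot)$ and density $j(\cdot)$ (as with the logistic), so that $\Pr(T^*_j = T^H_j) = J(\bar{X}_j\delta_1 + G_j\delta_2)$; the symbols $g$, $\delta_g$, and $j(\cdot)$ are introduced here for illustration only.

```latex
\frac{\partial \Pr(D_{ij} = 1)}{\partial g}
  = \left[ \Pr(D_{ij} = 1 \mid T^*_j = T^H_j) - \Pr(D_{ij} = 1 \mid T^*_j = T^L_j) \right]
    \, j(\bar{X}_j\delta_1 + G_j\delta_2) \, \delta_g
```

The expression uses $\Pr(T^*_j = T^L_j) = 1 - \Pr(T^*_j = T^H_j)$ and the fact that $g$ does not appear in the conditional default probabilities. When type-H groups default less often, the term in brackets is negative, so a variable that raises the probability of being type H ($\delta_g > 0$) lowers the unconditional probability of default.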

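If the error terms in Equations (2) and (3) have cumulative distribution functions (CDFs) $F(\cdot)$ and $J(\cdot)$, respectively, the estimated log likelihood for individual i in group j is given by

(5) $\ln L_{ij} = D_{ij} \ln\Big[\sum_{K=H,L} F(\alpha_K + X_{ij}\beta_{1,K} + C_j\beta_{2,K}) \Pr(T^*_j = T^K_j)\Big] + (1 - D_{ij}) \ln\Big[\sum_{K=H,L} \big(1 - F(\alpha_K + X_{ij}\beta_{1,K} + C_j\beta_{2,K})\big) \Pr(T^*_j = T^K_j)\Big]$, where $\Pr(T^*_j = T^H_j) = J(\bar{X}_j\delta_1 + G_j\delta_2)$ and $\Pr(T^*_j = T^L_j) = 1 - J(\bar{X}_j\delta_1 + G_j\delta_2)$.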

We approximate $F(\cdot)$ and $J(\cdot)$ with logistic CDFs and follow an iterative procedure for the parameter estimation. (9)
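A minimal sketch of the sample log likelihood implied by this setup, with logistic CDFs for both the repayment and the type equations, is given below. It is an illustration under stated assumptions, not the authors' estimation code: the function names and the argument layout are hypothetical, each row is one individual (with loan and type-equation variables repeated across members of the same group), and the mixture is applied at the individual level as in the text.

```python
import numpy as np

def logistic_cdf(z):
    """Logistic CDF, used here for both F and J."""
    return 1.0 / (1.0 + np.exp(-z))

def default_prob_by_type(X, C, alpha, beta1, beta2):
    """Pr(D_ij = 1 | type K) for a logistic error in the repayment equation."""
    return logistic_cdf(alpha + X @ beta1 + C @ beta2)

def mixture_loglik(D, X, C, W, theta_H, theta_L, delta):
    """Sum over individuals of the log of the two-type mixture probability.

    theta_H and theta_L are (alpha, beta1, beta2) tuples; W stacks the
    type-equation variables (average member characteristics and group controls).
    """
    p_H = logistic_cdf(W @ delta)                # probability of a type-H group
    q_H = default_prob_by_type(X, C, *theta_H)   # default probability if type H
    q_L = default_prob_by_type(X, C, *theta_L)   # default probability if type L
    p_default = p_H * q_H + (1.0 - p_H) * q_L    # unconditional default probability
    return np.sum(D * np.log(p_default) + (1 - D) * np.log(1.0 - p_default))

# Toy data: four individuals, two individual covariates, one loan covariate.
D = np.array([1, 0, 0, 1])
X = np.zeros((4, 2))
C = np.zeros((4, 1))
W = np.zeros((4, 3))
zero_theta = (0.0, np.zeros(2), np.zeros(1))
ll = mixture_loglik(D, X, C, W, zero_theta, zero_theta, np.zeros(3))
```

In practice this function would be passed (with a sign change) to a numerical optimizer, with the restriction that the type-H constant lies below the type-L constant imposed or checked after estimation.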

The proposed model belongs to the class of finite mixture density models. The identification of these models has been extensively studied in recent years (see Fox and Gandhi 2008; Gan, Huang, and Mayer 2015; Henry, Kitamura, and Salanie 2014; Hu 2008; Lewbel 2007; Mahajan 2006). In particular, Henry, Kitamura, and Salanie (2014) showed that under the following assumptions, the mixture density model with unobserved heterogeneity, such as the one defined above, is nonparametrically identified.

ASSUMPTION 1. (Mixture). The probability of belonging to a certain group type depends on a set of characteristics, which are not all necessarily observable; that is, it depends on $W_j = (W^o_j, W^u_j)$.

ASSUMPTION 2. (Exclusion Restriction). Conditional on the group type, both observable and unobservable factors that characterize $T^*_j$ are not related to the probability of defaulting; that is, $\Pr(D_{ij} = 1 \mid T^*_j = T^K_j, W^o_j, W^u_j) = \Pr(D_{ij} = 1 \mid T^*_j = T^K_j)$ for $K = H, L$.

The second assumption is the key identifying assumption, which implies that $W^o_j = (\bar{X}_j, G_j)$ and $W^u_j$ are conditionally independent of the errors in Equation (2); that is, $\bar{X}_j, G_j, W^u_j \perp u_{ij,K} \mid X_{ij}, C_j, T^*_j = T^K_j$ for $K = H, L$. Any association between $W_j$ and the probability of default is driven solely by the association between these variables and the probability of being of a certain group type. Mahajan (2006) referred to $W_j$ as instrumental-like variables (ILV). (10) Intuitively, the identification is similar to the requirement of instrumental variables in a two-stage least squares (2SLS) procedure, in which the instrumental variable is supposed to be correlated with the unobserved type variable but not correlated with the error term. (11)

Assumption 2 further implies that group heterogeneity in the proposed mixture structure can be fully controlled by only using a partial set of variables in $W_j$. Hence, we require some but not all information about the factors describing group heterogeneity ($T^*_j$) to identify the parameters in repayment Equation (2). Following Henry, Kitamura, and Salanie (2014) and Gan, Huang, and Mayer (2015), using the full set of $W^o_j$ or a subset of $W^o_j$ should produce consistent estimates of the parameters in the repayment Equation (2). A Hausman-type specification test can then be implemented comparing the estimated coefficients in Equation (2) using the full set of $W^o_j$ versus the estimates using a subset of $W^o_j$. This test is similar to an over-identification test in an instrumental variables approach. Failing to reject the null hypothesis of no systematic differences between the estimated coefficients provides supporting evidence for the model identification.

Henry, Kitamura, and Salanie (2014) argued that under Assumptions 1 and 2, we can obtain sharp bounds for both the probability of being of a certain group type (i.e., mixture weights) and the probability of defaulting conditional on the type (i.e., mixture components). Furthermore, point identification can be achieved for the two-type case under Assumptions 1 and 2 when one type dominates in the left tail of the default distribution and the other type dominates in the right tail. (12) This is satisfied in our case by the restriction that the error terms $u_{ij,H}$ and $u_{ij,L}$ in Equation (2) follow the same distribution but $\alpha_H < \alpha_L$. We can then formulate the following argument.

ARGUMENT 1. Under Assumptions 1 and 2 and $\alpha_H < \alpha_L$, the two-type mixture structure summarized in Equations (4) and (5) is uniquely identified.
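The Hausman-type comparison can be sketched as a quadratic form in the difference between the two sets of repayment-equation coefficients. The code below is a generic illustration, not the authors' implementation: the function name and the numbers are hypothetical, and it assumes that the estimator using the full set of type-equation variables is the more efficient one under the null, so that the difference of the covariance matrices estimates the variance of the coefficient difference.

```python
import numpy as np

def hausman_statistic(b_full, V_full, b_sub, V_sub):
    """Return the Hausman-type statistic and its degrees of freedom."""
    diff = np.asarray(b_sub, dtype=float) - np.asarray(b_full, dtype=float)
    V_diff = np.asarray(V_sub, dtype=float) - np.asarray(V_full, dtype=float)
    # The pseudo-inverse guards against a singular difference matrix in finite samples.
    stat = float(diff @ np.linalg.pinv(V_diff) @ diff)
    return stat, diff.size

# Hypothetical estimates for two repayment-equation coefficients.
b_full, V_full = [1.0, 2.0], np.diag([0.01, 0.01])
b_sub, V_sub = [1.1, 1.9], np.diag([0.02, 0.02])
stat, dof = hausman_statistic(b_full, V_full, b_sub, V_sub)
```

With these made-up inputs the statistic is about 2 with two degrees of freedom, well below the 5% chi-square critical value of 5.991, so the null of no systematic difference would not be rejected.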

Appendix A in Appendix S1, Supporting information, presents a simple simulation exercise to better illustrate the advantages of using a mixture structure such as the one described above when evaluating default behavior with heterogeneous agents, compared to a standard logit model. The exercise shows that in the presence of unobserved heterogeneity, a mixture structure can provide both a higher predictive performance and more accurate marginal effects (i.e., the effect of changes in a covariate on the probability of defaulting) than a logit model, although the bias in the marginal effects is not fully eliminated.

III. AN APPLICATION TO A GROUP LENDING PROGRAM IN INDIA

Next, we implement the proposed two-type model using data from a group lending program in India. We first describe the dataset and then present the estimation results.

A. Data

The groups under study are located in the state of Andhra Pradesh in India. (13) They are organized following a recent self-help group (SHG) model promoted by the World Bank, which targets poor women in rural areas and combines savings generation and microlending with social mobilization. In this program, women who generally live in the same village or habitat voluntarily form SHGs. A typical SHG consists of 10-20 members who meet regularly to discuss social issues and activities. During the group meetings, each member also deposits a small thrift payment into a joint bank account. Once enough savings have been accumulated, group members can apply for internal loans that draw from the accumulated savings at an interest rate to be determined by the group. After the group establishes a record of internal savings and repayment, it becomes eligible for loans through a commercial bank or program funds. (14)

The group as a whole, then, borrows from a commercial bank or program funds; all group members are held jointly liable for the debts of the others. The group generally allocates the loan to its members on an equal basis, and the group is not eligible for further loans unless it has made full repayment. (15) In this study, we focus on the first "expired" loan borrowed from commercial banks by each group. An "expired" loan refers to a loan that had passed its due date by the time the survey was conducted.

The working sample includes 1,110 different group loans which were allocated to a total of 12,833 members. The data are from a SHG survey conducted between August and October 2006 in eight districts in Andhra Pradesh, which were chosen to represent the state's three macro-regions (Rayalaseema, Telangana, and Coastal Andhra Pradesh). The SHG survey contains socioeconomic characteristics of group members (households) such as education background, housing condition, land and livestock ownership, occupation, and caste. It also includes group characteristics such as age, meeting frequency of members, and programs and services available within the group. In addition, the survey directly recorded from SHG account books the information regarding all group loans that were taken between June 2003 and June 2006. The information includes the terms of each loan, the group members to whom the loan was allocated, and how much of the loan had been repaid by each member at the time of the survey. (16)

The SHG survey was complemented with a previous village survey that covered all the villages from which the SHGs were sampled. We use this database to construct four indicators to account for the economic environment at the village level, including availability of a financial institution, public bus, telephone, and post office.

Table 1 presents descriptive statistics of the full sample. (17) The top panel (Panel 1) reports member characteristics based on 12,833 observations, while the bottom panel (Panel 2) reports group and loan characteristics based on 1,110 observations. The group characteristics are determined prior to the start of the loan. Approximately 23% of the group members are literate, 31% belong to a scheduled tribe or scheduled caste, and about 65% own some land. About 61% are agricultural laborers who do not own land or own such a small amount of land that they have to provide agricultural labor for others, 20% are self-employed agricultural workers, and the rest have other occupations. We observe that 80% of the group members in our sample fully repaid their loan by its due date (i.e., did not default).

Turning to the group and loan characteristics, the groups range from 7 to 20 members and have close to 13 members on average. In roughly nine of every ten groups, the members meet on a regular basis (at least monthly). About 28% of the groups have a food credit program (in-kind credit for subsidized rice), 15% have a marketing program, and 25% have an insurance program. The average loan size received by a group member is 3,338 rupees (about 67 USD). The annual rate of interest is about 12.8%, which is much lower than the prevailing rate of moneylenders in India. The average duration of a loan is roughly 1 year, and the vast majority of loans required the groups to make repayments at least monthly.

Preliminary Analysis. A first look at the data is indicative of a bimodal repayment distribution. Table 2 shows that in more than nine out of every ten groups in our sample, either all of the members do not default or all of them do default. In particular, in 76% of the groups (848 out of 1,110 groups), all of the group members fully repaid their loans or never defaulted; in another 17% of the groups (188 groups), all of the members defaulted. As discussed earlier, this repayment behavior may result from a combination of unobservable group factors. We can think then of two apparent group types: "responsible" and "irresponsible" groups. (18)
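The tabulation behind this kind of group-level summary can be sketched as follows: compute the share of defaulting members in each group and count the groups in which nobody, somebody, or everybody defaults. The function name and the toy data are hypothetical.

```python
import numpy as np

def classify_groups(D, group_ids):
    """Count groups with no, some, or all members defaulting."""
    D = np.asarray(D, dtype=float)
    _, inverse, counts = np.unique(np.asarray(group_ids),
                                   return_inverse=True, return_counts=True)
    # Share of defaulting members in each group.
    share = np.bincount(inverse, weights=D) / counts
    return {
        "none": int(np.sum(share == 0.0)),
        "some": int(np.sum((share > 0.0) & (share < 1.0))),
        "all": int(np.sum(share == 1.0)),
    }

# Toy example: three groups of two members each.
D = [0, 0, 1, 1, 0, 1]
groups = [1, 1, 2, 2, 3, 3]
table = classify_groups(D, groups)
```

Applied to the reported counts, 848 and 188 out of 1,110 groups correspond to about 76.4% and 16.9%, which leaves 74 groups (about 6.7%) with mixed repayment behavior.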

To further examine the possibility of homogeneous sorting among groups, Table B2 of Appendix B in Appendix S1 reports the number of groups in which the intragroup variance is less than or equal to the total variance, considering all groups in the same village and mandal for different borrower characteristics. (19) The characteristics include literacy, household characteristics, land ownership, occupation, and caste. The results show that individuals with similar observable characteristics appear to group together. On average, in 70-72% of the cases, the intragroup variance for a given characteristic is smaller than the intravillage or intramandal variance. There is a relatively higher degree of homogeneity among group members in terms of belonging to a scheduled tribe or caste and being a self-employed agricultural worker.
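The sorting check described above can be sketched by comparing, for each group, the variance of a characteristic within the group with its variance among all sampled borrowers in the same village (or mandal). The function name and the toy data below are hypothetical, and the population variance is used for both quantities; the paper's exact variance definition is not stated in this section.

```python
import numpy as np

def share_groups_more_homogeneous(x, group_ids, area_ids):
    """Share of groups whose within-group variance is at most the area variance."""
    x = np.asarray(x, dtype=float)
    group_ids = np.asarray(group_ids)
    area_ids = np.asarray(area_ids)
    groups = np.unique(group_ids)
    hits = 0
    for g in groups:
        in_group = group_ids == g
        # All borrowers located in the same area (village or mandal) as group g.
        in_area = area_ids == area_ids[in_group][0]
        if np.var(x[in_group]) <= np.var(x[in_area]):
            hits += 1
    return hits / len(groups)

# Toy example: one village with a mixed group A and a homogeneous group B.
x = [0, 1, 0, 1, 1, 1, 1, 1, 1, 1]
group_ids = ["A"] * 4 + ["B"] * 6
village_ids = ["v1"] * 10
share = share_groups_more_homogeneous(x, group_ids, village_ids)
```

Here group A is more dispersed than the village as a whole while group B is perfectly homogeneous, so half of the groups pass the check.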

B. Estimation Results

Table 3 shows the estimation results of the mixture model proposed. The model allows for two group types (type H and type L), and the repayment decision is conditional on the unobserved type, where the marginal effects of the member and loan characteristics may vary by type. The average member characteristics and other group and village controls help, in turn, to identify the group types. (20)

Several important patterns emerge from this table. First, the conditional probability of default is considerably different between the two group types, as reported at the bottom of the table. More specifically, the estimated probability of default conditional on being in a group of type-H individuals is 9.5% versus 62.8% in a group of type-L individuals. Hence, the model clearly distinguishes two group types: one "responsible" type (type H) and another "irresponsible" type (type L). The former group is likely composed of "low-risk" individuals with a strong social cohesion and/or effective enforcement, while the latter group is likely composed of "high-risk" individuals with a weak social cohesion and ineffective enforcement.

Similarly, the average probability of being a type-H group is roughly 80% in our sample; interestingly, groups in which all members pay back their loan exhibit a higher probability of being a type-H group than other groups. (21) In particular, in groups in which none of the members defaulted, the likelihood of being a type-H group is 82.9%; this is compared to 76.4% in groups in which some members defaulted and 66.9% in groups in which all members defaulted. These results further support the model's identification of seemingly "responsible" and "irresponsible" groups.
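The reported figures can be cross-checked with simple arithmetic. The sketch below weights the three type-H probabilities by the group counts from the preliminary analysis and then combines the result with the two conditional default probabilities. It is only a back-of-the-envelope check: the variable names are hypothetical, and combining averages in this way is an approximation to averaging the individual-level mixture probabilities.

```python
# Group counts and type-H probabilities as reported in the text.
n_groups = 1110
n_none_default, n_all_default = 848, 188
n_some_default = n_groups - n_none_default - n_all_default

p_H_none, p_H_some, p_H_all = 0.829, 0.764, 0.669

# Average probability of being a type-H group across all groups.
avg_p_H = (n_none_default * p_H_none
           + n_some_default * p_H_some
           + n_all_default * p_H_all) / n_groups

# Conditional default probabilities by type, weighted by the type probabilities.
p_default_H, p_default_L = 0.095, 0.628
p_default = avg_p_H * p_default_H + (1.0 - avg_p_H) * p_default_L
```

The weighted average comes out close to 0.80, in line with the "roughly 80%" stated above, and the implied unconditional default probability is close to 0.20, which matches the roughly 80% of members who fully repaid.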

An analysis of the factors used to describe the probability of being in a type-H group also indicates that "responsible" groups are more likely characterized by women who are literate, own some portion of land, live in semi-pucca houses, participate in agricultural activities, and belong to a scheduled tribe (but not necessarily to a leading caste). (22) Similarly, "responsible" groups are more prone to hold frequent meetings, have a marketing and insurance program but not a food credit program, and have access to additional services in the village such as a financial institution and telephone. This suggests that lenders may want to look for these characteristics when trying to identify potential "responsible" groups and areas in which to operate or expand.

Holding frequent meetings appears to be particularly important. This is in line with Rai and Sjostrom (2004), who emphasized the importance of information sharing to sustain repayment in group lending. It is also in line with other studies that suggest that frequent meetings, in addition to helping peer monitoring and enforcement, may directly increase social contact and reduce lending risks. Feigenberg, Field, and Pande (2013) showed, for instance, that repeated interactions can facilitate cooperation by allowing individuals to sustain reciprocal economic ties; Gine and Karlan (2014) found that groups with stronger social networks are less likely to experience default problems after removing joint liability. (23) The existence of other programs in the group (like marketing and insurance programs) could also stimulate social cooperation and strengthen social ties, in addition to providing additional services to members, thereby increasing risk-sharing among members. (24)

Figure B2 of Appendix B in Appendix S1 provides additional support to the correct identification of "responsible" and "irresponsible" groups, based on observed behavior patterns in the data. For example, the probability of being a type-H ("responsible") group is positively correlated with the proportion of literate women in the group; a closer look at the data shows that among groups in which more than half of the women are literate, there is a higher proportion of groups with no members defaulting (82%) and a lower proportion of groups with all members defaulting (13%), compared to groups in which less than half of the women are literate (76% and 17%, respectively). The differences are more pronounced when comparing the distribution of intragroup default behavior between groups with high and low meeting frequencies. Among groups that hold at least monthly meetings, which is also distinctive of type-H groups, the proportions of groups with no members defaulting and all members defaulting are 80% and 14%; among groups that hold less than monthly meetings, the corresponding proportions are 48% and 41%. (25) These findings suggest that several of the factors included in the type-probability equation help to identify potential group types and, in particular, that the types in the model are not purely identified by functional form. We further discuss the model identification below.

Conditional Marginal Effects. Another important pattern that emerges from Table 3 is the difference in direction and statistical significance of several of the parameter estimates in the default equation between the two group types. This suggests that the factors driving individual repayment behavior may vary by type. Table 4 shows the conditional marginal effects (evaluated at the sample means) for the different individual and loan characteristics included in the repayment equation after accounting for the group type; that is, the estimated effect of a change in each covariate on the probability of defaulting, conditional on being of a certain group type and keeping all else equal. (26)
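As a rough illustration of what "evaluated at the sample means" involves, the sketch below computes a type-specific marginal effect under a logistic CDF. The logistic form, the discrete-change treatment of dummy variables, and all names and numbers (`beta`, `x_bar`) are assumptions made for this example, not the specification behind Table 4.

```python
import math

def logistic_cdf(z):
    """Logistic CDF (assumed link function for this sketch)."""
    return 1.0 / (1.0 + math.exp(-z))

def marginal_effect_at_means(beta, x_bar, k, is_dummy=False):
    """Marginal effect of covariate k on Pr(default | type) at sample means.

    beta:  type-specific coefficients (hypothetical values in any example)
    x_bar: sample means of the covariates, aligned with beta
    """
    if is_dummy:
        # Discrete change: switch the dummy from 0 to 1, others at their means.
        x1 = list(x_bar)
        x1[k] = 1.0
        x0 = list(x_bar)
        x0[k] = 0.0
        index1 = sum(b * x for b, x in zip(beta, x1))
        index0 = sum(b * x for b, x in zip(beta, x0))
        return logistic_cdf(index1) - logistic_cdf(index0)
    # Continuous covariate: derivative of the logistic CDF times beta_k.
    index = sum(b * x for b, x in zip(beta, x_bar))
    p = logistic_cdf(index)
    return beta[k] * p * (1.0 - p)
```

Calling the function once with the type-H coefficients and once with the type-L coefficients would give the two conditional effects for the same covariate.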

We do not observe major changes in the probability of default among type-H group members after a change in most of the individual covariates; being a self-employed agricultural worker and living in a pucca house decrease the probability of default by roughly 3 and 1 percentage points, respectively, while owning some portion of land increases the likelihood of defaulting by less than 1 percentage point. Among type-L group members, in contrast, being a self-employed agricultural worker increases the probability of default by 14 percentage points; being an agricultural laborer also substantially increases the likelihood of defaulting (29 percentage points), as does belonging to a scheduled caste (31 percentage points). Owning some portion of land or living in either pucca or kutcha houses (relative to semi-pucca houses), in turn, decreases the probability of default by 8-16 percentage points.

Regarding the loan covariates, monthly (or higher) repayment frequencies and an additional member receiving a loan decrease the likelihood of defaulting by 3 and 0.2 percentage points, respectively, among type-H group members; among type-L group members, the corresponding decreases are 26 and 5 percentage points. An increase in the loan amount, interest rate, and loan duration also results in a much higher increase in the probability of default among type-L group members than among type-H group members.

These varying effects by type can help lenders to better assess their clients and understand the factors driving their behavior. Land ownership, housing conditions, labor activities, and membership in a scheduled tribe seem to matter among type-L groups, in contrast to type-H groups, for which the effects of these factors (if any) are much more limited. The loan characteristics are also more relevant for type-L groups than for type-H groups. These differences can help lending institutions to reduce their transaction costs by offering differentiated contracts based on group types.

Field and Pande (2008), for example, point out the trade-off between higher repayment frequencies (a standard practice among microfinance institutions to encourage fiscal discipline and reduce default risk) and a substantial increase in transaction costs of installment collection. The authors find that switching to lower frequency repayment schedules could allow lenders to significantly reduce their transaction costs with virtually no increase in client default, particularly among first-time borrowers. Our results suggest that the fiscal discipline imposed by frequent repayment is critical among groups suspected (or with a higher probability) of being type-L groups, but is not important for type-H groups, for which less costly repayment schedules could be implemented; the cost savings are likely higher than the (marginal) increase in the default rate in this type of group. Promoting longer term investments through higher loan terms also seems more reasonable among type-H groups, which could improve the borrowers' repayment capacity in the long run (similarly to a more flexible repayment schedule).

Encouraging additional members to receive a loan also seems to be more relevant among groups suspected of being type-L groups. As indicated by Armendariz de Aghion (1999), a larger group size tends to increase peer monitoring and pressure efforts due to joint responsibility, cost-sharing, and commitment effects for debt repayment, although this positive effect could be offset by the increase in the scope of free riding and higher coordination costs in considerably large groups. The results by group type suggest that among type-L groups, the stronger peer monitoring and pressure effects could outweigh the higher coordination costs of having additional members in the group.

Unconditional Marginal Effects. We can also compare the parameter estimates of the type-varying model to those obtained under a standard probabilistic regression. The two models are expected to produce different results, as the mixture model permits us to better account for the inherent (unobserved) group heterogeneity and reduce (but not eliminate) the endogeneity bias. Table 5 reports the unconditional marginal effects (evaluated at the sample means) on the probability of default resulting from the probit, two-type, and three-type models. (27) We include the results of the three-type model for comparison with the two-type model. Note that in the type-varying models, the average member characteristics and other group and village characteristics affect the likelihood of default through the probability of being in a particular group type.

Two patterns are worth noting. First, the resulting marginal effects of the two- and three-type models are relatively similar. While this may indicate stability in the estimates when moving to a mixture setup, it can also result from the fact that the predicted probability of the third group type is very close to zero in the three-type specification, which is consistent with the finding that a two-type model provides a better fit. Second, the probit and type-varying models produce different marginal effects. For example, being an agricultural laborer or belonging to a scheduled caste increases the overall probability of default by roughly 4 percentage points in the two-type model (all else equal), while in the probit model, the change in probability is not significant; a similar pattern is observed for the condition of living in pucca houses or being self-employed agricultural workers, both of which decrease the overall probability of default by 3 and 1 percentage points, respectively, in the type-varying model and are not significant in the probit model. Similarly, monthly (or higher) repayment frequencies will decrease the likelihood of defaulting by 6 percentage points in the two-type model and by 7 percentage points in the probit model, while an additional year in the length of the loan will increase the likelihood of defaulting by 4 percentage points in the two-type model and by more than 8 percentage points in the probit model.

From all models, however, the importance of holding frequent meetings among group members to improve individuals' performance on loan repayments becomes clear. In groups in which members meet at least monthly, the individual probability of default is 30 percentage points lower in the probit model and 45 percentage points lower in the type-varying model than in groups in which members meet less often. Frequent meetings may promote higher social interactions and result in stronger peer monitoring and pressure. Both models also suggest that defaulting is negatively correlated with promoting marketing and insurance programs among group members and positively correlated with subsidized food credit programs, which is also distinctive of poorer groups. (28)

In sum, the results show the importance of having a flexible, type-varying model, which further allows for varying effects by type and provides better insight about the possible factors affecting the members' repayment behavior.

Predictive Performance. We now analyze whether allowing for different group types yields better out-of-sample predictions for the probability of default. We want to examine whether the proposed type-varying model has a higher predictive power than standard probabilistic methods, which can further help to reduce information asymmetries in microlending, especially in the absence of experimental or quasi-experimental settings. To conduct the performance assessment, we follow a standard cross-validation procedure and randomly partition our dataset into a design sample for model estimation (60% of the observations) and a test sample for further analysis (40% of the observations). This exercise permits us to better approximate how the models will perform in practice when using new information sets. The partition is conducted at the group level and both samples maintain the population proportions of default and non-default cases.
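A group-level partition of this kind can be sketched as follows. The record keys (`group_id`, `default`), the random seed, and the stratification rule (grouping by whether any member defaulted) are hypothetical choices made for illustration; the text does not specify how the default proportions were preserved.

```python
import random
from collections import defaultdict

def split_groups(records, design_share=0.6, seed=0):
    """Split member-level records into design and test samples by group.

    records: list of dicts with hypothetical keys 'group_id' and 'default' (0/1).
    Groups are stratified by whether any member defaulted (an assumption made
    here for illustration), so both samples keep similar default proportions.
    """
    by_group = defaultdict(list)
    for rec in records:
        by_group[rec["group_id"]].append(rec)

    # Build strata of group ids: groups with and without any default.
    strata = defaultdict(list)
    for gid, members in by_group.items():
        strata[any(m["default"] for m in members)].append(gid)

    rng = random.Random(seed)
    design_ids = set()
    for gids in strata.values():
        gids = sorted(gids)  # deterministic order before shuffling
        rng.shuffle(gids)
        n_design = round(design_share * len(gids))
        design_ids.update(gids[:n_design])

    # All members of a group end up in the same sample.
    design = [r for r in records if r["group_id"] in design_ids]
    test = [r for r in records if r["group_id"] not in design_ids]
    return design, test
```

Splitting on group identifiers rather than on individual members keeps every member of a group in the same sample, so the test sample contains only groups that played no role in estimation.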

Table 6 provides performance indicators for the different models estimated. (29) The indicators include the mean squared prediction error and several performance indicators based on the conversion of the estimated default probabilities to a binary regime prediction using the standard 0.5 rule. (30) For the two-type model, the performance assessment is based on two alternative calculations of the probability of default. Generally speaking, a lender could evaluate a potential loan based on the estimated unconditional probability of default or based on the conditional probability of default, depending on the likelihood of being in a certain group type. Hence, different mixtures for estimating the probability of default could be used.

The two approaches considered are:

1. a "naive" approach that only uses the unconditional probability of default, such that

[mathematical expression not reproducible].

2. a "conservative" approach which takes into account the likelihood of being in a type-H group. In particular,

[mathematical expression not reproducible]

where $\widehat{\Pr}(T^*_j = T^H_j)$ is the estimated probability of being in a type-H group.
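The "naive" calculation can be sketched as follows, assuming the unconditional probability takes the standard two-type finite-mixture form: the type-H probability times the type-H default probability, plus the complementary term for type L. This form is an assumption, since the expression itself is not reproduced above, and the function and argument names are hypothetical. The "conservative" rule is not sketched because its exact definition cannot be recovered from the text.

```python
def unconditional_default_prob(p_type_h, p_default_h, p_default_l):
    """Type-probability-weighted default probability for a two-type mixture.

    p_type_h:    estimated probability that the group is type H
    p_default_h: estimated default probability conditional on type H
    p_default_l: estimated default probability conditional on type L
    """
    for p in (p_type_h, p_default_h, p_default_l):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    # Weight each conditional probability by the probability of its type.
    return p_type_h * p_default_h + (1.0 - p_type_h) * p_default_l
```

For instance, with purely illustrative inputs of 0.8, 0.05, and 0.6, the function returns approximately 0.16.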

As shown in the table, the "naive" and "conservative" approaches yield a lower mean squared prediction error than the probit model (0.145 and 0.156 vs. 0.159). The two-type approaches also show a higher overall predictive performance based on McFadden, Puig, and Kirschner's (1977) standard measure. (31) The "naive" approach has a predictive performance of 76.4% and the "conservative" approach has a predictive performance of 76%, compared to 74.7% for the probit model. The poorer performance of the probit model is largely explained by its lower correct default classification rate (i.e., identification of "bad" borrowers): 17.2% versus 21.9% for the "naive" approach and 31.3% for the "conservative" approach. Regarding the correct nondefault classification rate (i.e., identification of "good" borrowers), the probit model performs better than the "conservative" approach but worse than the "naive" approach.
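The indicators discussed above can be computed along the following lines. This is a minimal sketch with hypothetical names; it reports the plain share of correct classifications and does not implement the McFadden, Puig, and Kirschner (1977) measure, and it assumes a borrower is predicted to default when the predicted probability is at or above the cut-off.

```python
def performance_indicators(y_true, p_hat, cutoff=0.5):
    """Mean squared prediction error and classification rates.

    y_true: list of observed outcomes (1 = default, 0 = nondefault)
    p_hat:  list of predicted default probabilities
    A borrower is predicted to default when p_hat >= cutoff.
    """
    n = len(y_true)
    n_default = sum(y_true)
    n_nondefault = n - n_default
    if n_default == 0 or n_nondefault == 0:
        raise ValueError("need at least one default and one nondefault case")

    mspe = sum((y - p) ** 2 for y, p in zip(y_true, p_hat)) / n
    pred = [1 if p >= cutoff else 0 for p in p_hat]

    # Count correctly classified defaults and nondefaults separately.
    hit_default = sum(1 for y, d in zip(y_true, pred) if y == 1 and d == 1)
    hit_nondefault = sum(1 for y, d in zip(y_true, pred) if y == 0 and d == 0)

    return {
        "mspe": mspe,
        "correct_default_rate": hit_default / n_default,
        "correct_nondefault_rate": hit_nondefault / n_nondefault,
        "overall_correct_rate": (hit_default + hit_nondefault) / n,
    }
```

Reporting the two classification rates separately matters here because defaults are the minority outcome, so a model can classify most borrowers correctly overall while still missing most defaulters.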

An alternative way to evaluate the out-of-sample performance consists of examining the number of "good" clients that the model rates as "bad" (Type I error) and the number of "bad" clients that the model rates as "good" (Type II error) for varying cut-off values of the probability of default. In Table 6, we used the standard 0.5 rule for the performance assessment, but a lender may consider alternative threshold rules. Figures 1 and 2 compare the percentage of "good" borrowers rejected and the percentage of "bad" borrowers accepted across the probit, "naive", and "conservative" approaches for different cut-off values.

In the case of Type I errors, the "naive" approach and the probit model outperform the "conservative" approach for most of the cut-off values. More specifically, for cut-off values above 0.1, the lending institution will do better in identifying "good" clients by relying on the "naive" approach or the probit model. In the case of Type II errors, however, both the "naive" and the "conservative" approach outperform the probit model for basically the entire range of cut-off values; for values above 0.3, the "conservative" approach has a considerably higher (and increasing) performance than the "naive" approach. For sufficiently lenient acceptance rules (cut-off values above 0.5), the differences in the percentage of "bad" borrowers accepted between the "conservative" approach and the other models are in the order of 10-23 percentage points.
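The comparison across cut-off values can be sketched as below. The function names are hypothetical, and the acceptance rule (accept when the predicted default probability is strictly below the cut-off) is an assumption consistent with higher cut-offs being more lenient.

```python
def error_rates(y_true, p_hat, cutoff):
    """Type I and Type II error rates (in percent) for one cut-off value.

    A borrower is accepted when the predicted default probability is below
    the cut-off, so higher cut-offs correspond to more lenient rules.
    Assumes y_true contains at least one 0 ("good") and one 1 ("bad").
    """
    good = [p for y, p in zip(y_true, p_hat) if y == 0]
    bad = [p for y, p in zip(y_true, p_hat) if y == 1]
    type_1 = 100.0 * sum(p >= cutoff for p in good) / len(good)  # good rejected
    type_2 = 100.0 * sum(p < cutoff for p in bad) / len(bad)     # bad accepted
    return type_1, type_2

def error_curve(y_true, p_hat, cutoffs):
    """Error rates over a grid of cut-offs, e.g. for plotting."""
    return [(c, *error_rates(y_true, p_hat, c)) for c in cutoffs]
```

Running `error_curve` once per model on the same test sample would produce the series needed to compare the approaches at each cut-off.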

Overall, we generally attain a higher predictive power when allowing for unobserved group types in modeling the probability of default of group members. The proposed model can thus help lenders allocate their resources more efficiently by better identifying and selecting current and future clients (groups). If the lending institution is more interested in reducing its default rates (i.e., by minimizing the number of "bad" clients classified as "good"), the lender should probably follow a "conservative" approach. In contrast, if the lender is more interested in increasing its pool of "good" borrowers (i.e., by minimizing the number of "good" clients classified as "bad"), it should follow a "naive" approach, although the probit model will also perform well in this case. However, for more lenient acceptance rules, using a "naive" approach or probit model will also result in a much higher acceptance rate of "bad" clients relative to the "conservative" approach. (32)

Model Identification. Finally, we formally evaluate the identification of the model. As noted previously, a direct implication of the type-varying model is that we require some but not all information about the factors describing group heterogeneity ($T^*_j$) to identify the parameters in the main repayment equation. If the model is correctly identified, a partial set of the observable characteristics ($\bar{X}_j$, $G_j$) used in the type Equation (3) should produce estimated coefficients in the repayment Equation (2) similar to those produced by a full set of these variables.

Table 7 reports the corresponding Hausman test results when comparing our baseline model that includes the full set of variables in $\bar{X}_j$ and $G_j$ versus alternative specifications that exclude some of these variables. We use a Hausman-type specification test because it is a standard test, although we acknowledge that its statistical power may be low in some cases. (33) To make the test more rigorous, we exclude different sets of variables instead of individual variables. The coefficients of both the individual and the loan characteristics, included in the repayment equation, are generally not too sensitive to the exclusion of different sets of variables in the group-type equation. In all cases, there are no major systematic differences (at a 5% level of significance) between the estimated coefficients in the repayment equation across the different models. (34) This exercise supports the robustness of the estimated mixture model.
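For reference, a Hausman-type statistic for this comparison generally takes the form below. The subscripts $F$ (full set of type-equation variables) and $R$ (restricted set) are notation introduced here, and writing the variance of the difference as $\hat{V}_R - \hat{V}_F$ assumes that the full-specification estimator is efficient under the null; the text does not state which variant was used.

```latex
H = (\hat{\beta}_{R} - \hat{\beta}_{F})'
    \left[ \hat{V}_{R} - \hat{V}_{F} \right]^{-1}
    (\hat{\beta}_{R} - \hat{\beta}_{F})
    \;\overset{a}{\sim}\; \chi^{2}(k)
```

Here $\hat{\beta}$ collects the repayment-equation coefficients being compared and $k$ is their number. A small value of $H$ relative to the $\chi^{2}(k)$ critical value indicates no systematic difference between the two sets of estimates.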

IV. CONCLUDING REMARKS

This paper proposes and implements a mixture structure to model repayment behavior in group lending with unobserved group heterogeneity. Group-level unobservables may result from a combination of factors, including peer selection and pressure as well as other elements such as social cohesion. In the model, individuals make repayment decisions based on their unobserved group type and observable individual and loan characteristics. Average member characteristics and other group and village characteristics help, in turn, to identify the group types. We also allow the marginal effects in the repayment equation to vary across types. We discuss the model properties and identification and provide evidence supporting the robustness of the model.

We implement the model using data from a group lending program in India. The estimation results support the model specification and show the advantages of relying on a type-varying method when examining the probability of default of group members. First, the model clearly distinguishes two group types: an apparent "responsible" group with a low probability of default and an apparent "irresponsible" group with a high probability of default. Frequent group interactions seem to be the foremost characteristic of "responsible" groups. Second, we find important differences across types in the marginal effects of the different characteristics included in the repayment equation. For example, imposing high-frequency repayment schedules and shorter loan terms and promoting a larger group size appear more appropriate for seemingly "irresponsible" groups. Third, the type-varying model generally shows a higher predictive performance than standard probabilistic models, particularly in the identification of potential "defaulters."

The proposed model can attenuate information asymmetries in microlending by helping lenders to better classify their potential clients. In particular, the model can help microfinance institutions to decrease the default rates they face by reducing the inclusion of potential "bad" borrowers and, to a minor extent, by increasing the inclusion of "good" borrowers who are left out in sensitive microcredit markets. In addition, the model can help lenders to better understand the potential factors driving the repayment behavior of different group members. Understanding these different factors can aid lenders in the design of loan contracts for different "types" of clients. By doing so, microfinance providers can allocate resources more efficiently and reduce the high transaction costs they face.

It is worth noting that the analysis has focused on a two-type model, given the nature of the data used in the application. Certainly, there can be a wider set of types in other contexts; the proposed model can be easily adapted to allow for additional types. Considerably increasing the number of types may, however, require imposing restrictions on the coefficients in the repayment equation (e.g., not necessarily allowing for different marginal effects across all types) to avoid a highly parameterized model, which could be difficult to estimate in practice. Our analysis also follows a discrete treatment of the repayment decision, given the observed behavior of most of the borrowers in the sample (either full repayment or no payment). However, the model can be modified to examine instead the share of the loan repaid by members. Last, as opposed to several other studies on group repayment, we take advantage of member-level data, which is often difficult to obtain. The proposed model can also be used in a similar manner to model group repayment using group-level data.

ABBREVIATIONS

2SLS: Two-Stage Least Squares

CDF: Cumulative Distribution Function

ILV: Instrumental-Like Variables

NDVI: Normalized Difference Vegetation Index

SBIC: Schwarz Bayesian Information Criterion

SHG: Self-Help Group

SQP: Sequential Quadratic Programming

doi: 10.1111/ecin.12541

REFERENCES

Ahlin, C. "Matching for Credit: Risk and Diversification in Thai Microcredit Groups." BREAD Working Paper No. 251, December, 2009.

Ahlin, C., and R. M. Townsend. "Using Repayment Data to Test across Models of Joint Liability Lending." Economic Journal, 117(517), 2007, F11-51.

Armendariz de Aghion, B. "On the Design of a Credit Agreement with Peer Monitoring." Journal of Development Economics, 60(1), 1999, 79-104.

Armendariz de Aghion, B., and J. Morduch. "Microfinance: Where Do We Stand?" in Financial Development and Economic Growth: Explaining the Links, edited by C. Goodhart. Basingstoke: Palgrave Macmillan, 2004.

--. The Economics of Microfinance. Cambridge, MA: MIT Press, 2005.

Banerjee, A., T. Besley, and T. Guinnane. "The Neighbor's Keeper: The Design of a Credit Cooperative with Theory and a Test." Quarterly Journal of Economics, 109(2), 1994, 491-515.

Chowdury, P. R. "Group Lending: Sequential Financing, Lending Monitoring and Joint Liability." Journal of Development Economics, 77(2), 2005, 415-39.

Cull, R., A. Demirguc-Kunt, and J. Morduch. "Financial Performance and Outreach: A Global Analysis of Leading Microbanks." Economic Journal, 117(517), 2007, F107-33.

Dong, Y., L. Gan, and Y. Wang. "Residential Mobility, Neighborhood Effects, and Educational Attainment of Blacks and Whites." Econometric Reviews, 34(6-10), 2015, 763-98.

Fearon, J. D., M. Humphreys, and J. M. Weinstein. "Can Development Aid Contribute to Social Cohesion after Civil War? Evidence from a Field Experiment in Post-Conflict Liberia." American Economic Review, 99(2), 2009, 287-91.

Feigenberg, B., E. Field, and R. Pande. "The Economic Returns to Social Interaction: Experimental Evidence from Microfinance." Review of Economic Studies, 80(4), 2013, 1459-83.

Field, E., and R. Pande. "Repayment Frequency and Default in Microfinance: Evidence from India." Journal of the European Economic Association, 6(2-3), 2008, 501-9.

Fox, J. T., and A. Gandhi. "Identifying Heterogeneity in Economic Choice and Selection Models using Mixture Models." Mimeo, University of Chicago, 2008.

Gan, L., and M. A. Hernandez. "Making Friends with Your Neighbors? Agglomeration and Tacit Collusion in the Lodging Industry." Review of Economics and Statistics, 95(3), 2013, 1002-17.

Gan, L., F. Huang, and A. Mayer. "A Simple Test of Private Information in the Insurance Markets with Heterogeneous Insurance Demand." Economics Letters, 136, 2015, 197-200.

Ghatak, M. "Group Lending, Local Information, and Peer Selection." Journal of Development Economics, 60(1), 1999, 27-50.

--. "Screening by the Company You Keep: Joint Liability Lending and the Peer Selection Effect." Economic Journal, 110(465), 2000, 601-31.

Gine, X., and D. Karlan. "Group versus Individual Liability: Short and Long Term Evidence from Philippine Microcredit Lending Groups." Journal of Development Economics, 107, 2014, 65-83.

Gine, X., P. Jakiela, D. Karlan, and J. Morduch. "Microfinance Games." American Economic Journal: Applied Economics, 2(3), 2010, 60-95.

Henry, M., Y. Kitamura, and B. Salanie. "Partial Identification of Finite Mixtures in Econometric Models." Quantitative Economics, 5(1), 2014, 123-44.

Hermes, N., and R. Lensink. "The Empirics of Microfinance: What Do We Know?" Economic Journal, 117, 2007, 1-10.

Hermes, N., R. Lensink, and H. Mehrteab. "Peer Monitoring, Social Ties and Moral Hazard in Group Lending Programmes: Evidence from Eritrea." World Development, 33(1), 2005, 149-69.

Hu, Y. "Identification and Estimation of Nonlinear Models with Misclassification Error Using Instrumental Variables: A General Solution." Journal of Econometrics, 144(1), 2008, 27-61.

Karlan, D. "Social Connections and Group Banking." Economic Journal, 117, 2007, 52-84.

Keane, M., and K. Wolpin. "The Career Decisions of Young Men." Journal of Political Economy, 105(3), 1997, 473-522.

Knittel, C., and V. Stango. "Price Ceilings as Focal Points for Tacit Collusion: Evidence from Credit Cards." American Economic Review, 93(5), 2003, 1703-29.

Lewbel, A. "Estimation of Average Treatment Effects with Misclassification." Econometrica, 75(2), 2007, 537-51.

Li, S., Y. Liu, and K. Deininger. "How Important Are Endogenous Peer Effects in Group Lending? Estimating a Static Game of Incomplete Information." Journal of Applied Econometrics, 28(5), 2013, 864-82.

Mahajan, A. "Identification and Estimation of Regression Models with Misclassification." Econometrica, 74(3), 2006, 631-65.

McFadden, D., C. Puig, and D. Kirschner. "Determinants of the Long-Run Demand for Electricity." Proceedings of the American Statistical Association (Business and Economics Statistics Section, Part 2), 1977, 109-17.

Paxton, J., D. Graham, and C. Thraen. "Modeling Group Loan Repayment Behavior: New Insights from Burkina Faso." Economic Development and Cultural Change, 48(3), 2000, 639-55.

de Quidt, J., T. Fetzer, and M. Ghatak. "Group Lending without Joint Liability." Journal of Development Economics, 121, 2012, 217-36.

Rai, A., and T. Sjostrom. "Is Grameen Lending Efficient? Repayment Incentives and Insurance in Village Economies." Review of Economic Studies, 71(1), 2004, 217-34.

Shankar, S. "Transaction Costs in Group Micro Credit in India: Case Studies of Three Microfinance Institutions." Centre for Microfinance, Institute for Financial and Management Research Working Paper, August, 2006.

Sharma, M., and M. Zeller. "Repayment Performance in Group-Based Credit Programs in Bangladesh: An Empirical Analysis." World Development, 25(10), 1997, 1731-42.

Stiglitz, J. "Peer Monitoring and Credit Markets." World Bank Economic Review, 4(3), 1990, 351-66.

van Tassel, E. "Group Lending under Asymmetric Information." Journal of Development Economics, 60(1), 1999, 3-25.

Varian, H. "Monitoring Agents with Other Agents." Journal of Institutional and Theoretical Economics, 146, 1990, 153-74.

Wydick, B. "Can Social Cohesion Be Harnessed to Repair Market Failure? Evidence from Group Lending in Guatemala." Economic Journal, 109(457), 1999, 463-75.

Zeller, M. "Determinants of Repayment Performance in Credit Groups: The Role of Program Design, Intragroup Risk Pooling, and Social Cohesion." Economic Development and Cultural Change, 46(3), 1998, 599-620.

SUPPORTING INFORMATION

Additional Supporting Information may be found in the online version of this article:

Appendix A. Exercise using simulated data

Table A1. Model performance using simulated data

Appendix B. Supplementary Tables and Figures

Table B1. Data description

Table B2. Sorting based on observables

Figure B1. Location of villages in Andhra Pradesh and group default behavior

Figure B2. Distribution of intra-group default behavior by different group characteristics

(1.) Ghatak (1999, 2000) and van Tassel (1999) showed, for example, that in a context of individuals with heterogeneous risk types and asymmetric information (where borrowers know each other's type but lenders do not), group lending with joint liability will lead to the formation of relatively homogeneous groups of either safe or risky borrowers (i.e., positive assortative matching or homogeneous sorting). The rationale is that while a borrower of any type prefers a safe partner because of lower expected joint liability payments, safe borrowers value safe partners more than risky borrowers do because they repay more often. Ahlin (2009) also found that borrowers will antidiversify risk within groups in order to lower their chances of facing liability for group members. Even in the absence of a joint liability scheme, the unobserved informal risk sharing and social cohesion among members may result in heterogeneous group types with different repayment rates (see de Quidt, Fetzer, and Ghatak 2012; Feigenberg, Field, and Pande 2013; Gine and Karlan 2014).

(2.) Besides mitigating adverse selection through peer screening, group lending helps alleviate moral hazard behavior and enforce repayment because members can more closely monitor each other's use of loans and exert pressure to prevent deliberate default. See Stiglitz (1990), Varian (1990), Banerjee, Besley, and Guinnane (1994), Armendariz de Aghion (1999), and Chowdury (2005).

(3.) For instance, we would expect more correlated defaults in groups with low trust levels among members (i.e., if a partner falls behind in her payments or defaults, it may induce others to do so in a context of low trust), whereas we would expect more nondefaults or full repayments in groups with high trust levels and important peer effects like peer monitoring.

(4.) In our application, this implies dropping more than 90% of the observations.

(5.) We did not consider additional type specifications, as the estimation of models with more than three types presents convergence issues with our working sample.

(6.) The estimated probability of a third group type is also very close to zero, as opposed to the other two types (0.00002 vs. 0.80686 and 0.19312). Further details are available upon request.

(7.) This flexibility is similar to Gan and Hernandez (2013), who allow for varying coefficients across potential collusive and noncollusive regimes when modeling the pricing and occupancy rate behavior of hotels under a switching regression framework.

(8.) Additional details regarding the different model specifications considered are available upon request. We tried including the standard deviation of the members' characteristics (as proxies of group bonding) in the type equation, but the model using only average characteristics provides a better fit. We prefer to limit the number of regressors for convergence purposes.

(9.) We obtain qualitatively similar results when using a normal CDF. For the optimization process, we use the sequential quadratic programming (SQP) iterative method, which is a medium-scale algorithm.

(10.) Mahajan (2006) studies the identification of regression models with a misclassified binary regressor in a mixture density context; the existence of ILV is one of the key assumptions of his study. ILV are assumed to be independent of the misclassified binary regressor conditional on a set of observed covariates and the true type. A direct implication of this conditional independence is that ILV only affect the modeled outcome through the true type.

(11.) Note also that the parameters in Equation (3) may not be consistently estimated, as $T^*_j$ is determined by both observable and unobservable factors; however, this does not prevent us from obtaining consistent estimates of the parameters in the repayment Equation (2).

(12.) For the case of more than two types, additional assumptions are required for point identification.

(13.) The data were collected in 2006, before the split of Andhra Pradesh. The "Andhra Pradesh" referred to throughout this study includes the two current states of Andhra Pradesh and Telangana. During the study period, Andhra Pradesh was the fourth largest state in India by area and the fifth largest by population.

(14.) The process of internal savings and repayments promotes social interaction among members and also helps to further screen individuals as some may leave the group prior to obtaining a formal loan. Groups may also implement nonlending programs such as in-kind credit for subsidized rice, marketing, and insurance programs.

(15.) If some members fail to repay some installments, the other members still have the incentive to repay on time, in hope that the delinquent borrowers will repay their installments on a future date. Naturally, a woman who maintains a good record and ends up in a group in which not all members fulfilled their loan obligations may join another group in the future.

(16.) See Li, Liu, and Deininger (2013) for further details on the survey instrument and data collection.

(17.) A detailed description of the variables used in the analysis is provided in Table B1 of Appendix B in Appendix S1.

(18.) The fact that groups in which all members defaulted in our sample are not concentrated at particular locations also reduces the possibility of specific weather shocks or other contextual factors in specific areas explaining the observed default behavior. Figure B1 of Appendix B in Appendix S1 shows that villages with a high proportion of groups in which all members default are well dispersed across the eight districts analyzed. For areas with available weather data (rainfall) and vegetation information (normalized difference vegetation index or NDVI), we also did not find any significant correlation between these measures and default behavior. In particular, we included these variables as additional regressors in the empirical model as a robustness check and found that they are jointly insignificant.

(19.) The comparisons exclude all villages (150 out of 457) and mandals (3 out of 97) where there is only one group in the village or mandal. A mandal is the equivalent to a subdistrict in India and comprises several villages.

(20.) As noted earlier, the village and group controls are predetermined before the start of the loan; these variables, however, are still not required to be fully exogenous to identify the group types.

(21.) Recall that in our raw data, we observe full repayment by all members in 76% of the groups, while in another 17% of the groups, all members default.

(22.) A semi-pucca house is characterized by a combination of materials generally found in both pucca and kutcha houses (the other two house types in the sample). A pucca house has walls and roofs made of burnt bricks, stones, cement concrete, and timber, while a kutcha house uses hay, bamboo, mud, and grass.

(23.) Ultimately, frequent meetings may proxy for female empowerment, which could also affect whether the borrowers or group can command household resources for repayment.

(24.) Fearon, Humphreys, and Weinstein (2009) and Feigenberg, Field, and Pande (2013) showed the importance of community development programs in different settings to encourage social cohesion.

(25.) Similar patterns are observed when comparing groups with and without marketing programs and a financial institution in the village, which are also correlated with the likelihood of being a type-H group in the model.

(26.) The normal-based confidence intervals reported for the estimated marginal effects are based on 200 bootstrap replications and are bias-corrected. Although not reported, the bootstrap means are very similar to the estimated marginal effects, which supports the bootstrap procedure implemented.

(27.) We use a probit model because it provides a better fit and performance than a logit and a linear probability model. We consider both a probit model that only accounts for member and loan characteristics (simple probit model) and a second model that also adds average member characteristics and other group and village controls (full probit model). For comparison purposes, the confidence intervals of the marginal effects for all models were derived using 200 bootstrap replications.

(28.) The existence of a financial institution and a telephone in the village is also highly correlated with a positive repayment behavior under the two models.

(29.) The results are based on 200 repeated 60%-40% partitions. The results are not sensitive to alternative data partitions (70%-30% and 50%-50%, respectively).

(30.) If the estimated default probability is greater than or equal to 0.5, the individual is predicted to default; otherwise, the individual is predicted to not default.

(31.) The McFadden, Puig, and Kirschner (1977) overall performance measure is equal to $p_{11} + p_{22} - p_{12}^{2} - p_{21}^{2}$, where $p_{ij}$ is the $ij$th entry (expressed as a fraction of the sum of all entries) in the 2 x 2 confusion matrix of actual versus predicted (0,1) outcomes using the 0.5 rule.

(32.) For example, for a cut-off value of 0.4, the "naive" approach outperforms the "conservative" approach by 3 percentage points in terms of the rejection rate of "good" clients, while the "conservative" approach outperforms the "naive" approach by a similar degree in terms of the acceptance rate of "bad" clients. However, for a cut-off value of 0.6, the "naive" approach outperforms the "conservative" approach by 4 percentage points when identifying "good" clients, while the "conservative" approach outperforms the "naive" approach by 14 percentage points when identifying "bad" clients.

(33.) We evaluated the statistical power of the Hausman test in a context similar to ours using simulated data and found a power of 53%-73% for sample sizes between 1,000 and 15,000 observations. Additional details are available upon request.

(34.) Naturally, the coefficients are less sensitive when excluding individual variables from the type equation. Details are available upon request.

Caption: FIGURE 1 Comparison of Type I Errors

Caption: FIGURE 2 Comparison of Type II Errors

LI GAN, MANUEL A. HERNANDEZ AND YANYAN LIU*

*We thank Alan de Brauw, Arun Chandrasekhar, Hanming Fang, Dean Karlan, Thierry Magnac, Carlos Martins-Filho, Eduardo Nakasone, Salvador Navarro, Petra Todd, Annabel Vanroose, Ruth Vargas-Hill, and seminar participants at the Winter Meetings of the Econometric Society, Latin American Meeting of the Econometric Society, China Meeting of the Econometric Society, Pacific Development Conference, Experimental Methods in Policy Conference, Peruvian Economic Association Annual Conference, IFPRI, GRADE, and Universidad de Piura for their helpful comments. We also thank the staff of the Center for Economics and Social Studies, particularly Prof. S. Galab, for their support and collaboration in making the data available. Similarly, we thank Zhe Guo for his valuable research assistance. Last, we would like to thank Dietrich Vollrath and two anonymous referees for their many useful comments. We gratefully acknowledge financial support from the CGIAR Research Program on Policies, Institutions and Markets and the Private Enterprise Research Center (PERC) of Texas A&M University.

Gan: Professor, Department of Economics, Texas A&M University and NBER, College Station, TX 77843. Phone 979-862-1667, Fax 979-847-8757, E-mail ganli@tamu.edu

Hernandez: Research Fellow, Markets, Trade and Institutions Division, International Food Policy Research Institute--IFPRI, Washington, DC 20006. Phone 202-862-5645, Fax 202-467-4439, E-mail m.a.hernandez@cgiar.org

Liu: Senior Research Fellow, Markets, Trade and Institutions Division, International Food Policy Research Institute--IFPRI, Washington, DC 20006. Phone 202-862-4649, Fax 202-467-4439, E-mail y.liu@cgiar.org

TABLE 1 Summary Statistics

Variable                              Mean    Std. Dev.   Min     Max

Panel 1: Individual characteristics
(12,883 observations)
  If defaulted                        0.20      0.40      0.00    1.00
  If literate                         0.23      0.42      0.00    1.00
  If disabled member in household     0.06      0.24      0.00    1.00
  If owns land                        0.65      0.48      0.00    1.00
  If lives in pucca house             0.33      0.47      0.00    1.00
  If lives in kutcha house            0.22      0.42      0.00    1.00
  If self-employed agricultural       0.20      0.40      0.00    1.00
  worker
  If agricultural laborer             0.61      0.49      0.00    1.00
  If belongs to scheduled             0.31      0.46      0.00    1.00
  tribe/caste
  If belongs to leading caste         0.92      0.27      0.00    1.00

Panel 2: Group and loan
characteristics (1,110 groups)
Average member characteristics
  % literate                          0.22      0.21      0.00    0.94
  % disabled member in household      0.05      0.10      0.00    0.94
  % own land                          0.59      0.31      0.00    0.95
  % live in pucca house               0.32      0.31      0.00    0.95
  % live in kutcha house              0.21      0.26      0.00    0.95
  % self-employed agricultural        0.18      0.30      0.00    0.95
  worker
  % agricultural laborer              0.56      0.36      0.00    0.95
  % belong to scheduled               0.31      0.43      0.00    1.00
  tribe/caste
  % belong to leading caste           0.91      0.14      0.36    1.00

Other group and village
characteristics
  Age of group (years)                6.44      2.49      1.00   25.00
  If group has food credit program    0.28      0.45      0.00    1.00
  If group has marketing program      0.15      0.35      0.00    1.00
  If group has insurance program      0.25      0.43      0.00    1.00
  If group meets at least monthly     0.89      0.31      0.00    1.00
  If located in Telangana             0.45      0.50      0.00    1.00
  If located in Rayalaseema           0.26      0.44      0.00    1.00
  If located in Coastal Andhra        0.29      0.45      0.00    1.00
  Pradesh
  Number of group members             12.52     2.37      7.00   20.00
  If financial institution in         0.34      0.47      0.00    1.00
  village
  If public bus in village            0.66      0.48      0.00    1.00
  If telephone in village             0.75      0.43      0.00    1.00
  If post office in village           0.63      0.48      0.00    1.00

Loan characteristics
  Amount of loan (rupees)             3,338     2,685     400    25,000
  Number of members with loan         11.61     3.24      2.00   20.00
  Annual interest rate (%)            12.83     3.10      6.00   25.00
  Length of loan (years)              1.11      0.46      0.17    5.00
  If repayment at least monthly       0.96      0.19      0.00    1.00
  If loan due in 2004                 0.11      0.31      0.00    1.00
  If loan due in 2005                 0.49      0.50      0.00    1.00
  If loan due in 2006                 0.40      0.49      0.00    1.00

TABLE 2

Intragroup Default Behavior

                                   Groups

Default Behavior                     #       %

If none of the members defaulted    848    76.4
If all of the members defaulted     188    16.9
If some of the members defaulted     74     6.7
Total                              1,110   100.0

TABLE 3

Probability of Default, Two-Type Model

Variable                                   Type H           Type L
                                        Coefficient       Coefficient
                                        (Std. Error)      (Std. Error)

                                       Dependent variable: If default

Constant                               -3.399 (0.629)   7.775 (28.740)
If literate                            0.160 (0.105)     0.540 (0.206)
If disabled member in household        0.258 (0.163)    -0.263 (0.383)
If owns land                           0.180 (0.119)    -0.556 (0.181)
If lives in pucca house                -0.198 (0.122)   -0.997 (0.186)
If lives in kutcha house               0.022 (0.124)    -0.844 (0.209)
If self-employed agricultural worker   -0.593 (0.184)    1.173 (0.266)
If agricultural laborer                0.120 (0.140)     1.748 (0.155)
If belongs to scheduled tribe/caste    0.082 (0.110)     2.736 (0.279)
If belongs to leading caste            -0.092 (0.163)    0.260 (0.383)
Amount of loan (1,000 rupees)          0.068 (0.016)     0.462 (0.049)
Number of members with loan            -0.062 (0.090)   -0.338 (0.151)
Number of members with loan squared    0.001 (0.004)     0.003 (0.007)
Annual interest rate (%)               0.083 (0.013)     0.277 (0.034)
Length of loan (years)                 0.508 (0.081)     0.963 (0.193)
If repayment at least monthly          -0.497 (0.244)   -10.989 (5.515)
If loan due in 2005                    -1.267 (0.435)   -0.128 (0.287)
If loan due in 2006                    1.052 (0.189)     1.229 (0.286)
Probability of type-H group
Constant                               -2.901 (2.501)
% literate                             1.921 (0.409)
% disabled member in household         1.630 (0.777)
% own land                             0.707 (0.212)
% live in pucca house                  -1.124 (0.276)
% live in kutcha house                 -1.052 (0.228)
% self-employed agricultural worker    0.697 (0.323)
% agricultural laborer                 1.902 (0.318)
% belong to scheduled tribe/caste      0.623 (0.167)
% belong to leading caste              -1.020 (0.496)
Age of group (years)                   0.025 (0.066)
Age of group squared                   -0.004 (0.004)
If group has food credit program       -0.951 (0.115)
If group has marketing program         1.688 (0.277)
If group has insurance program         0.443 (0.139)
If group meets at least monthly        3.105 (0.223)
If located in Telangana                2.320 (0.255)
If located in Rayalaseema              0.652 (0.211)
Number of group members                0.132 (0.360)
Number of group members squared        -0.014 (0.014)
If financial institution in village    0.979 (0.139)
If public bus in village               0.139 (0.117)
If telephone in village                1.076 (0.168)
If post office in village              -0.684 (0.130)
Predicted probability of being
type-H group
  Average                                                    79.8%
  Groups, no members defaulting                              82.9%
  Groups, all members defaulting                             66.9%
  Groups, some members defaulting                            76.4%

Predicted individual default
probability
  Average                                                    19.6%
  Conditional on being in type-H                             9.5%
  group
  Conditional on being in type-L                             62.8%
  group
# observations                                              12,883
Log likelihood                                             -5,111.6
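
The relationship between the type probabilities and the default probabilities summarized at the bottom of Table 3 can be illustrated with a short calculation. The sketch below is not the authors' code; it assumes a standard two-component mixture in which the unconditional default probability is the type-weighted average of the type-conditional probabilities, and the function name `unconditional_default_prob` is hypothetical. It plugs in the sample averages from Table 3, so the result (about 0.203) only approximates the reported 19.6% average, presumably because the table averages individual-level predictions rather than combining averages.

```python
# Illustrative finite-mixture calculation (an assumption, not the authors' code).
# Inputs are the sample averages reported in Table 3, not individual estimates.

def unconditional_default_prob(prob_type_h, p_default_h, p_default_l):
    """Weight the type-conditional default probabilities by the type probabilities."""
    return prob_type_h * p_default_h + (1.0 - prob_type_h) * p_default_l


# Average P(type H) = 79.8%; average default probability of 9.5% for type H
# and 62.8% for type L.
approx = unconditional_default_prob(0.798, 0.095, 0.628)
print(f"{approx:.3f}")
```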

TABLE 4

Conditional Marginal Effects (Percentage Points)

                                                  Type H
                                             Marginal Effect

Variable                                     [95% Confidence
                                                Interval]

Individual characteristics
  If literate                              0.84 [-0.14 to 1.81]
  If disabled member in household          1.44 [-0.54 to 3.53]
  If owns land                             0.89 [0.23 to 1.69]
  If lives in pucca house                 -0.97 [-1.91 to -0.06]
  If lives in kutcha house                 0.11 [-0.78 to 1.19]
  If self-employed agricultural worker    -2.57 [-3.91 to -1.19]
  If agricultural laborer                  0.60 [-0.72 to 1.82]
  If belongs to scheduled tribe/caste      0.42 [-0.18 to 1.14]
  If belongs to leading caste             -0.48 [-2.48 to 1.18]

Loan characteristics
  1,000 rupees increase in loan            0.36 [0.22 to 0.50]
  One more member with loan               -0.23 [-0.32 to -0.13]
  1% increase interest rate                0.44 [0.32 to 0.52]
  One more year in length of loan          3.23 [2.27 to 3.95]
  If repayment at least monthly           -3.08 [-5.08 to -1.11]
  If loan due in 2005                     -6.60 [-8.33 to -4.97]
  If loan due in 2006                      6.03 [4.10 to 7.43]

                                                   Type L
                                              Marginal Effect

Variable                                      [95% Confidence
                                                 Interval]

Individual characteristics
  If literate                               7.33 [2.39 to 11.57]
  If disabled member in household         -4.21 [-24.12 to 11.92]
  If owns land                             -7.87 [-13.13 to -2.19]
  If lives in pucca house                 -16.44 [-21.08 to -9.58]
  If lives in kutcha house                -14.47 [-21.46 to -8.02]
  If self-employed agricultural worker     13.95 [7.65 to 18.10]
  If agricultural laborer                  29.16 [19.65 to 36.86]
  If belongs to scheduled tribe/caste      31.20 [24.78 to 36.05]
  If belongs to leading caste              4.15 [-8.23 to 14.55]

Loan characteristics
  1,000 rupees increase in loan             5.92 [4.08 to 6.88]
  One more member with loan                -4.77 [-7.24 to -1.04]
  1% increase interest rate                 3.77 [2.39 to 4.68]
  One more year in length of loan          10.39 [6.79 to 12.36]
  If repayment at least monthly           -26.28 [-35.23 to -13.69]
  If loan due in 2005                      -1.91 [-6.85 to 4.88]
  If loan due in 2006                      17.05 [12.08 to 20.68]

Note: The marginal effects are calculated at the means of the
covariates. For continuous variables, the corresponding change
is indicated in the table. For discrete variables, the change is
from 0 to 1. The confidence intervals reported are normal-based and
bias-corrected using 200 bootstrap replications.

TABLE 5

Unconditional Marginal Effects (Percentage Points)

                           Probit Model      Full Probit Model
                          Marginal Effect     Marginal Effect
                          [95% Confidence     [95% Confidence
Variable                     Interval]           Interval]

Individual
characteristics
  If literate                  -0.81               -0.18
                          [-2.01 to 0.51]     [-1.84 to 1.58]

  If disabled member           -1.62               -0.04
  in household            [-4.01 to 0.72]     [-3.15 to 3.21]

  If owns land                 -0.84                0.18
                          [-1.71 to 0.27]     [-1.37 to 2.18]

  If lives in pucca            -0.37               -0.73
  house                   [-1.48 to 0.64]     [-2.86 to 1.22]

  If lives in kutcha           2.82                -0.11
  house                   [1.43 to 4.26]      [-2.30 to 2.19]

  If self-employed             -0.37                0.04
  agricultural worker     [-2.09 to 1.02]     [-3.25 to 2.76]

  If agricultural              0.76                 0.59
  laborer                 [-0.67 to 2.02]     [-2.12 to 3.16]

  If belongs to                6.10                -1.98
  scheduled               [5.40 to 6.83]      [-5.35 to 1.17]
  tribe/caste

  If belongs to                3.12                -0.23
  leading caste           [1.06 to 4.76]      [-3.37 to 2.05]

Loan characteristics

  1,000 rupees                 1.60                 1.45
  increase in loan        [1.46 to 1.76]       [1.30 to 1.63]

  One more member              0.01                 0.15
  with loan               [-0.14 to 0.16]     [-0.06 to 0.34]

  1% increase                  1.19                 1.37
  interest rate           [1.13 to 1.26]       [1.30 to 1.45]

  One more year in             7.90                 8.31
  length of loan          [7.47 to 8.26]       [7.90 to 8.69]

  If repayment at             -14.03               -6.78
  least monthly          [-15.83 to -12.55]   [-8.28 to -5.51]

  If loan due in 2005          -6.01               -5.84
                          [-6.59 to -5.36]    [-6.44 to -5.14]

  If loan due in 2006          9.52                10.64
                          [8.90 to 10.18]     [9.97 to 11.35]

Average member
characteristics

  10% increase                                      0.00
  literate                                    [-0.21 to 0.21]

  10% increase                                     -0.94
  disabled member                             [-1.35 to -0.56]

  10% increase own                                 -0.51
  land                                        [-0.74 to -0.33]

  10% increase pucca                               -0.12
  house                                       [-0.33 to 0.12]

  10% increase                                      0.45
  kutcha house                                 [0.20 to 0.68]

  10% increase                                      0.12
  self-employed                               [-0.19 to 0.48]
  agricultural worker

  10% increase                                      0.18
  agricultural                                [-0.11 to 0.47]
  laborer

  10% increase                                      0.75
  scheduled                                    [0.42 to 1.11]
  tribe/caste

  10% increase                                      0.49
  leading caste                                [0.24 to 0.85]

Other group and
village
characteristics

  One more year of                                  1.19
  age of group                                 [1.03 to 1.36]

  If group has food                                 8.08
  credit program                               [7.67 to 8.57]

  If group has                                     -6.12
  marketing program                           [-6.49 to -5.76]

  If group has                                     -5.29
  insurance program                           [-5.75 to -4.88]

  If group meets at                                -30.11
  least monthly                              [-30.88 to -29.49]

  If located in                                    -9.58
  Telangana                                   [-10.03 to -9.13]

  If located in                                    -2.79
  Rayalaseema                                 [-3.32 to -2.28]

  One more member                                  -1.41
  in group                                    [-1.63 to -1.15]

  If financial                                     -6.01
  institution in                              [-6.39 to -5.65]
  village

  If public bus in                                  1.19
  village                                      [0.83 to 1.59]

  If telephone                                     -3.43
  in village                                  [-3.83 to -3.01]

  If post office                                    0.97
  in village                                   [0.66 to 1.34]

                           Two-Type Model      Three-Type Model
                          Marginal Effect      Marginal Effect
                          [95% Confidence      [95% Confidence
Variable                     Interval]            Interval]

Individual
characteristics
  If literate                   1.56                 1.52
                           [0.54 to 2.50]       [0.51 to 2.46]

  If disabled member            0.82                 0.65
  in household            [-1.45 to 2.76]      [-1.62 to 2.60]

  If owns land                 -0.08                -0.02
                          [-0.80 to 0.76]      [-0.74 to 0.82]

  If lives in pucca            -2.68                -2.59
  house                   [-3.67 to -1.50]     [-3.57 to -1.40]

  If lives in kutcha           -1.50                -1.47
  house                   [-2.74 to -0.20]     [-2.70 to -0.16]

  If self-employed             -0.74                -0.81
  agricultural worker     [-2.13 to 0.51]      [-2.20 to 0.44]

  If agricultural               3.76                 3.66
  laborer                  [2.30 to 5.03]       [2.20 to 4.93]

  If belongs to                 3.83                 3.86
  scheduled                [2.38 to 5.33]       [2.41 to 5.36]
  tribe/caste

  If belongs to                 0.03                -0.21
  leading caste            [-1.94 to 1.51]     [-2.21 to 1.24]

Loan characteristics

  1,000 rupees                  0.97                 0.94
  increase in loan         [0.77 to 1.11]       [0.73 to 1.07]

  One more member              -0.74                -0.66
  with loan               [-0.95 to -0.37]     [-0.87 to -0.29]

  1% increase                   0.81                 0.77
  interest rate            [0.65 to 0.89]       [0.62 to 0.86]

  One more year in              4.02                 3.84
  length of loan           [3.21 to 4.48]       [3.01 to 4.29]

  If repayment at              -5.65                -5.49
  least monthly           [-7.60 to -3.39]     [-7.43 to -3.22]

  If loan due in 2005          -6.08                -6.01
                          [-7.17 to -4.85]     [-7.10 to -4.78]

  If loan due in 2006           7.25                 6.97
                           [5.55 to 8.39]       [5.27 to 8.11]

Average member
characteristics

  10% increase                 -1.34                -1.35
  literate                [-1.66 to -1.04]     [-1.67 to -1.05]

  10% increase                 -1.15                -1.10
  disabled member         [-1.64 to -0.56]     [-1.59 to -0.50]

  10% increase own             -0.52                -0.54
  land                    [-0.80 to -0.28]     [-0.83 to -0.31]

  10% increase pucca            0.88                 0.87
  house                    [0.60 to 1.13]       [0.59 to 1.12]

  10% increase                  0.82                 0.85
  kutcha house             [0.46 to 1.25]       [0.49 to 1.28]

  10% increase                 -0.51                -0.49
  self-employed           [-0.92 to -0.05]     [-0.90 to -0.03]
  agricultural worker

  10% increase                 -1.33                -1.32
  agricultural            [-1.64 to -1.01]     [-1.63 to -1.00]
  laborer

  10% increase                 -0.46                -0.50
  scheduled               [-0.72 to -0.28]     [-0.76 to -0.33]
  tribe/caste

  10% increase                  0.80                 0.91
  leading caste            [0.29 to 1.53]       [0.40 to 1.64]

Other group and
village
characteristics

  One more year of              0.06                 0.08
  age of group            [-0.21 to 0.37]      [-0.20 to 0.38]

  If group has food             8.46                 9.14
  credit program          [4.94 to 13.33]      [5.64 to 14.02]

  If group has                 -8.36                -8.47
  marketing program       [-9.43 to -7.51]     [-9.55 to -7.62]

  If group has                 -3.07                -3.35
  insurance program       [-4.50 to -2.20]     [-4.79 to -2.49]

  If group meets at            -44.59               -44.95
  least monthly          [-47.40 to -42.51]   [-47.78 to -42.89]

  If located in                -18.01               -18.23
  Telangana              [-22.78 to -13.68]   [-23.03 to -13.93]

  If located in                -4.27                -4.18
  Rayalaseema             [-5.33 to -3.02]     [-5.24 to -2.93]

  One more member               1.27                 1.17
  in group                 [0.60 to 1.73]       [0.49 to 1.62]

  If financial                 -6.59                -6.84
  institution in          [-8.22 to -5.45]     [-8.48 to -5.70]
  village

  If public bus in             -1.06                -0.92
  village                 [-1.72 to -0.12]     [-1.56 to 0.04]

  If telephone                 -9.96                -9.87
  in village              [-11.56 to -8.18]    [-11.47 to -8.09]

  If post office                4.85                 5.10
  in village               [3.89 to 6.31]       [4.14 to 6.56]

Note: The marginal effects are calculated at the means of
the covariates. For continuous variables, the corresponding
change is indicated in the table. For discrete variables, the
change is from 0 to 1. The confidence intervals reported are
normal-based and bias-corrected using 200 bootstrap replications.

TABLE 6

Out-of-Sample Performance of Alternative Models

                                   Probit   Two-Type      Two-Type
Indicator                          Model    "Naive"    "Conservative"

                                       Out-of-sample predictive
                                       performance (5,068 obs.)

Mean square predicted error        0.159      0.145            0.156
Predictive performance             74.7%      76.4%            76.0%
Correct default/nondefault         77.9%      79.2%            78.6%
  classification
Correct default classification     17.2%      21.9%            31.3%
  (sensitivity), 1,062 defaults
Correct nondefault                 94.0%      94.4%            91.2%
  classification (specificity),
  4,006 nondefaults

Note: The "naive" approach is based on the unconditional
probability of default of each individual. The "conservative"
approach uses the probability of default based on the probability
of an individual being in a particular group type. The performance
and classification rates are based on converting the estimated
default probabilities to a binary regime prediction using the
standard 0.5 rule. The predictive performance measure is based on
McFadden, Puig, and Kirschner (1977); the measure is equal to
$p_{11} + p_{22} - p_{12}^{2} - p_{21}^{2}$, where
$p_{ij}$ is the $ij$th entry in the standard 2 x 2 confusion matrix
of actual versus predicted (0,1) outcomes in which the entries are
expressed as a fraction of the sum of all entries. Sensitivity
accounts for the percentage of cases in which individuals
defaulting are also predicted to default, while specificity
measures the percentage of cases in which individuals not
defaulting are also predicted to not default. The results are based
on 200 repeated 60%-40% data partitions (averages reported).
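
The classification measures described in the note to Table 6 can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the function names `classify` and `performance` and the toy data are hypothetical, and the indexing of the confusion matrix (rows for actual outcomes, columns for predicted outcomes, with nondefault first) is an assumption. The performance measure is unaffected by that ordering because it treats the two off-diagonal entries symmetrically.

```python
# Minimal sketch of the measures in the Table 6 note (toy data, hypothetical names).

def classify(default_probs, cutoff=0.5):
    """Predict default (1) when the estimated probability is >= cutoff."""
    return [1 if p >= cutoff else 0 for p in default_probs]


def performance(actual, predicted):
    """Return accuracy, the McFadden-Puig-Kirschner measure, sensitivity, specificity."""
    n = len(actual)
    # counts[a][b]: cases with actual outcome a and predicted outcome b (0/1).
    counts = [[0, 0], [0, 0]]
    for a, b in zip(actual, predicted):
        counts[a][b] += 1
    # Entries expressed as a fraction of the sum of all entries.
    p = [[c / n for c in row] for row in counts]
    n_default = counts[1][0] + counts[1][1]        # assumed to be nonzero
    n_nondefault = counts[0][0] + counts[0][1]     # assumed to be nonzero
    return {
        "accuracy": p[0][0] + p[1][1],
        "mpk": p[0][0] + p[1][1] - p[0][1] ** 2 - p[1][0] ** 2,
        "sensitivity": counts[1][1] / n_default,
        "specificity": counts[0][0] / n_nondefault,
    }


# Toy example: 3 actual defaults and 7 actual nondefaults.
actual = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]
probs = [0.7, 0.3, 0.1, 0.2, 0.6, 0.4, 0.05, 0.3, 0.55, 0.45]
metrics = performance(actual, classify(probs))
print(metrics)
```

Passing a different `cutoff` to `classify` (for example 0.4 or 0.6) gives the kind of cut-off comparison discussed in the notes.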

TABLE 7

Hausman Tests: Baseline Model versus
Alternative Specifications

Variables Excluded               [H.sub.0]: Difference in
                                 Coefficients of Repayment
                                 Equation between Baseline
                                   Model and Alternative
                                      Specifications
                                      Not Systematic

Average member characteristics            16.610
                                          (0.165)
Group programs                            12.402
                                          (0.574)
Frequency of group meetings               32.087
                                          (0.076)
Group location                            11.307
                                          (0.662)

Note: Hausman chi-squared statistics are reported, with p
values in parentheses.

COPYRIGHT 2018 Western Economic Association International
No portion of this article can be reproduced without the express written permission from the copyright holder.