Why Do Platforms Use Ad Valorem Fees? Evaluating Two Alternative Explanations.
Wang, Zhu
Platforms that intermediate transactions between sellers and buyers
have become increasingly important in the economy. People are familiar
with, for example, online marketplaces (such as Amazon and eBay),
payment platforms (such as Visa, MasterCard, and PayPal), and hotel
booking sites (such as Booking.com and Expedia). These platforms pose a
pricing puzzle, however: they almost universally rely on ad valorem
fees, that is, fees charged to sellers in proportion to the transaction
value, sometimes with a small per-transaction fee added. Given that
these platforms do not incur significant costs that vary with
transaction value, it is puzzling why ad valorem fees are so prevalent.
In this article, we review two alternative explanations for this
pricing puzzle. One theory, provided by Shy and Wang (2011) and others,
emphasizes the vertical relation between the platform and the sellers.
It is shown that in the case where the platform (i.e., the upstream) and
the sellers (i.e., the downstream) both have market power (i.e.,
so-called "double marginalization") (1), the platform extracts
a higher profit by using a proportional fee than by using a
per-transaction fee. Another explanation, offered by Wang and Wright (2017), instead
focuses on the price discrimination angle. The key idea is that for a
platform dealing with transactions of many different goods that vary
widely in their costs and values, ad valorem fees serve as an efficient
form of price discrimination that increases the platform's profit.
While these two explanations provide alternative views, we will show
that they indeed complement each other in explaining the ad valorem fee
puzzle.
Our article contributes to a growing literature on platforms and
their fee structures. In fact, besides the two theories analyzed in this
article, there are additional (competing or complementary) views on ad
valorem platform fees. For example, Loertscher and Niedermayer (2012)
consider a mechanism design approach in an independent private values
setup with privately informed buyers and sellers, in which an
intermediary's optimal fees converge to linear fees as markets
become increasingly thin. Muthers and Wismer (2013) show that if a
platform can commit to proportional fees, this can reduce a hold-up
problem that arises from the platform wanting to compete with sellers
after they have incurred costs to enter the platform. Hagiu and Wright
(forthcoming) provide a theory that ad valorem contracts align the
incentives between upstream firms (principals) and downstream firms
(agents), which allows the principal to achieve the same profits as if
it could observe the demand shocks and control price.
The article is organized as follows. In Section 1, we first lay out
two simple models that each justify one of the two explanations: double
marginalization versus price discrimination. In Section 2, we then study
a generalized model that accommodates both explanations. Our findings
suggest that, in reality, platforms may choose a simple ad valorem fee
schedule that addresses both double marginalization and price
discrimination considerations. In Section 3, we apply the generalized
model to a calibration exercise using data on DVD sales on Amazon and
quantify the relative importance of the two explanations. Finally,
Section 4 offers concluding remarks.
1. TWO ALTERNATIVE EXPLANATIONS
In this section, we lay out two simple models that each highlight
one of the two alternative explanations: double marginalization versus
price discrimination.
Double Marginalization
We first study a model environment similar to Shy and Wang (2011),
where double marginalization motivates the use of ad valorem fees. (2)
Consider that a monopoly seller sells a good on a monopoly platform. The
good is indexed by c, the per-unit cost of the good to the seller, which
is known to everyone in the market. There is a unit mass of buyers, each
of whom wants to purchase one unit of the good. The value of the good to
a buyer is c (1 + b), where b [greater than or equal to] 0 is a
parameter that the buyer draws. (3) We assume that 1 + b is randomly
distributed according to a cumulative distribution function F. Only
buyers know their own b, while F is public information.
For illustrative purposes, we assume that F takes on a simple
Pareto distribution
F (x) = 1 - [x.sup.-[lambda]]. (1)
Accordingly, the number of transactions [Q.sub.c] for the good c is
the measure of buyers who obtain a nonnegative surplus from buying the
good at price [p.sub.c], Pr (c (1 + b) - [p.sub.c] [greater than or
equal to] 0). Therefore, the demand function for good c is
[Q.sub.c]([p.sub.c]) = 1 - F([p.sub.c]/c) =
[([p.sub.c]/c).sup.-[lambda]], (2)
which has the constant elasticity [lambda]. For the monopoly
pricing problem to be well-defined, we require that [lambda] > 1.
The platform incurs a cost of d [greater than or equal to] 0 per
transaction, and it can potentially charge fees to either the buyer side
or the seller side or both. Regardless of which side is charged, the
final price faced by buyers will reflect any fees, and the buyer treats
these the same whether she faces them directly or through sellers. Due
to this standard result on the irrelevance of the incidence of taxes
across the two sides, we can assume without loss of generality that only
the seller side is charged.
In terms of timing, the platform moves first and announces the fee
schedule it would charge the seller. Taking the fee schedule as given,
the seller then decides the price of the good. Finally, buyers make
purchase decisions.
Given the model setup, we are interested in the following question:
If the platform can choose among a per-transaction fee, a proportional
fee, or a mix of both fees, what type of fee schedule would the platform
prefer?
To answer the question, we consider that the platform decides on an
affine fee schedule, T ([p.sub.c]) = [t.sub.0] + [t.sub.1][p.sub.c],
which covers all the possibilities listed above. We assume that the
platform cannot subsidize the seller to operate by setting [t.sub.0]
< 0. Doing so would likely create an adverse incentive: the seller
could simply collect the subsidy without selling anything real. This
imposes the requirement that [t.sub.0] [greater than or equal to] 0.
The model can be solved backward. Because the platform would make
its fee decision by incorporating the seller's response, we solve
the seller's problem first. The seller, taking the affine fee
schedule ([t.sub.0], [t.sub.1]) charged by the platform as given, would
choose [p.sub.c] to maximize her profit:
$$\max_{p_c} \; \left[(1 - t_1)\, p_c - c - t_0\right] \left(\frac{p_c}{c}\right)^{-\lambda},$$
which implies
$$p_c^* = \frac{\lambda (c + t_0)}{(\lambda - 1)(1 - t_1)}. \tag{3}$$
Anticipating the seller's pricing decision [p.sub.c.sup.*],
the platform would then choose [t.sub.0] and [t.sub.1] to solve
$$\max_{t_0, t_1} \; \left(t_0 + t_1 p_c^* - d\right) \left(\frac{p_c^*}{c}\right)^{-\lambda}$$
subject to the constraint [t.sub.0] [greater than or equal to] 0.
We can verify that the constraint [t.sub.0] [greater than or equal to] 0 is
binding at the maximum, so the optimal affine fee schedule is just a
proportional fee:
$$t_0 = 0, \qquad t_1 = \frac{c + d(\lambda - 1)}{\lambda c + d(\lambda - 1)}. \tag{4}$$
Given that [lambda] > 1, we know 1 > [t.sub.1] > 0.
This simple model yields several useful findings. First, in the
presence of double marginalization (i.e., when both the platform and the
seller have market power), the platform strictly prefers a proportional
fee to a per-transaction fee. Note that the use of a proportional fee
allows the platform to mitigate, but not eliminate, double
marginalization. In fact, if the seller side has no market power (or the
platform owns the seller), the platform, being the single monopoly in
the market, would earn an even higher profit and would be indifferent
between a proportional fee and a per-transaction fee, as we will show in
the analysis coming next. Second, to implement the optimal proportional fee,
the platform needs to know c unless the marginal cost d of the platform
is zero, in which case the platform has a simple formula [t.sub.1] =
1/[lambda]. Considering that d is typically small in reality, a platform
may use [t.sub.1] = 1/[lambda] as a good proxy even if it has no
knowledge of c.
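As a rough numerical check of the findings above, the following sketch searches over proportional fees on a grid and compares the result with the closed form in (4). The parameter values (c = 10, d = 0.5, [lambda] = 3), the grid resolution, and the function names are illustrative assumptions and are not taken from Shy and Wang (2011).

```python
# Numerical check of the optimal proportional fee (4) for a monopoly seller
# facing constant-elasticity demand Q = (p/c)^(-lam).
# All parameter values below are illustrative assumptions.

def seller_price(c, t0, t1, lam):
    """Seller's profit-maximizing price, equation (3)."""
    return lam * (c + t0) / ((lam - 1.0) * (1.0 - t1))

def platform_profit(c, d, lam, t0, t1):
    """Platform profit: (fee collected - cost d) times quantity sold."""
    p = seller_price(c, t0, t1, lam)
    return (t0 + t1 * p - d) * (p / c) ** (-lam)

def closed_form_t1(c, d, lam):
    """Optimal proportional fee, equation (4)."""
    return (c + d * (lam - 1.0)) / (lam * c + d * (lam - 1.0))

def best_t1_on_grid(c, d, lam, t0, steps=9999):
    """Grid search over t1 in (0, 1) for a given t0."""
    grid = [(k + 1) / (steps + 1) for k in range(steps)]
    return max(grid, key=lambda t1: platform_profit(c, d, lam, t0, t1))

c, d, lam = 10.0, 0.5, 3.0
t1_grid = best_t1_on_grid(c, d, lam, t0=0.0)
t1_star = closed_form_t1(c, d, lam)
profit_prop = platform_profit(c, d, lam, 0.0, t1_star)

# Best profit when a positive per-transaction component is forced in
# and t1 is re-optimized on the grid for each such t0.
profit_mixed = max(
    platform_profit(c, d, lam, t0, best_t1_on_grid(c, d, lam, t0))
    for t0 in (0.5, 1.0, 2.0)
)
print(t1_grid, t1_star, profit_prop, profit_mixed)
```

Under these assumed parameters, the grid maximizer should agree with (4) up to the grid step, and the pure proportional fee should yield a higher profit than any of the schedules with [t.sub.0] > 0 that are tried.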
The model above serves as a simple illustrative example. As shown
in Shy and Wang (2011) and others, the result holds in more general
settings, including the cases where sellers engage in Cournot
competition with or without free entry. (4)
Price Discrimination
In contrast to the double marginalization explanation, we now study
an alternative model proposed by Wang and Wright (2017) where price
discrimination motivates the use of ad valorem fees. In doing so, we
consider the same model setup as above except for two things: (i) a
variety of goods is being sold on the platform, with the costs c
differing widely across goods; and (ii) for each good c, there are
multiple sellers who engage in Bertrand competition, so sellers have no
market power. (5) The rest of the model specification remains
unchanged--for each good c, there is a unit mass of buyers each of whom
wants to purchase one unit of the good. Buyers draw their benefit 1 + b
from a simple Pareto distribution, and as a result sellers face
constant-elasticity demand. The platform considers charging sellers an
affine fee schedule, T ([p.sub.c]) = [t.sub.0] + [t.sub.1][p.sub.c],
subject to the constraint [t.sub.0] [greater than or equal to] 0.
Assume c takes on a finite number of distinct values in the set
C. The probability distribution of c on C is denoted [g.sub.c], with
$\sum_{c \in C} g_c = 1$. As before, we solve the
sellers' problem first. For each good c, taking the affine fee
schedule as given, Bertrand sellers compete by setting the lowest
possible price just to break even, so that
$$p_c^* = c + t_0 + t_1 p_c^* \;\Longrightarrow\; p_c^* = \frac{c + t_0}{1 - t_1}.$$
Anticipating sellers' pricing decisions, the platform would
then choose [t.sub.0] and [t.sub.1] to solve
$$\max_{t_0, t_1} \; \sum_{c \in C} g_c \left(t_0 + t_1 p_c^* - d\right) \left(\frac{p_c^*}{c}\right)^{-\lambda}. \tag{5}$$
To derive the solution to (5) intuitively, we first consider the
hypothetical scenario where the platform could perfectly observe the
cost and valuation for each good c and set a different optimal fee
([t.sub.0], [t.sub.1]) for each as follows:
$$\max_{t_0, t_1} \; \left(t_0 + t_1 p_c^* - d\right) \left(\frac{p_c^*}{c}\right)^{-\lambda},$$
which is equivalent to solving
$$\max_{t_0, t_1} \; \left(\frac{t_0 + c t_1}{1 - t_1} - d\right) \left(1 + \frac{t_0 + c t_1}{c (1 - t_1)}\right)^{-\lambda}.$$
The first-order condition implies a unique value of
([t.sub.0.sup.*] + c[t.sub.1.sup.*])/(1 - [t.sub.1.sup.*]) such that
$$\frac{t_0^* + c t_1^*}{1 - t_1^*} = \frac{c + \lambda d}{\lambda - 1}, \tag{6}$$
which could be potentially consistent with different fee schedules
([t.sub.0.sup.*], [t.sub.1.sup.*]). For example, the optimal fee could
be a pure per-transaction fee or a pure proportional fee, but those fee
schedules have to depend on c. However, one can verify that there is a
unique affine fee
[t.sub.0.sup.*] = d; [t.sub.1.sup.*] = 1/[lambda] (7)
that also satisfies the condition (6), but the fee schedule does
not depend on c. This means that the affine fee (7) maximizes the
platform's overall profit (5) without requiring the platform to
keep track of the goods traded.
This yields several new findings. First, for a given good, when the
cost c is known to the platform and sellers have no market power, the
platform is indifferent between charging a proportional fee and a
per-transaction fee. This contrasts with our finding above that a proportional
fee is strictly preferred to a per-transaction fee when sellers do have
market power. Second, the platform can maximize profit by implementing
the affine fee (7) without conditioning on c, which is a great
advantage. There are often a large number of goods being traded on a
platform, and the platform may not be able to track each good's
cost and value. In this case, using the affine fee (7) requires no
information on c, so it can be easily used by the platform. This results
in optimal price discrimination in the sense that charging ad valorem
fees (7) allows the platform to achieve the same level of profit that
could be obtained under third-degree price discrimination as if the
platform could perfectly observe the cost and valuation for each good
traded. Finally, note that the optimal affine fee (7) has a
per-transaction term [t.sub.0.sup.*] > 0 only if the platform incurs
a positive marginal cost d; otherwise, a proportional fee [t.sub.1] =
1/[lambda] is optimal. Again, considering that d is typically small in
reality, a simple proportional fee [t.sub.1] = 1/[lambda] can be a good
proxy in practice.
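The argument can be illustrated with a short sketch. It checks that, for several hypothetical goods, the single affine fee (7) collects exactly the good-specific fee implied by (6), and that it earns more than the best uniform per-transaction fee. The cost values, the equal weights across goods, and the parameter values are illustrative assumptions.

```python
# Check that the affine fee (7), t0 = d and t1 = 1/lam, collects the
# good-specific optimal fee implied by (6) for every cost c.
# Bertrand sellers price at p = (c + t0) / (1 - t1); demand is (p/c)^(-lam).
# Costs, weights, and parameters are illustrative assumptions.

def bertrand_price(c, t0, t1):
    """Break-even price of Bertrand sellers under the fee t0 + t1 * p."""
    return (c + t0) / (1.0 - t1)

def profit(c, d, lam, fee):
    """Platform profit on good c when the total fee per sale is `fee`."""
    p = c + fee  # Bertrand sellers pass the fee through one for one
    return (fee - d) * (p / c) ** (-lam)

d, lam = 0.5, 4.0
costs = [2.0, 10.0, 50.0, 200.0]  # equally weighted goods (assumption)

rows = []
for c in costs:
    fee_affine = bertrand_price(c, d, 1.0 / lam) - c  # fee collected under (7)
    fee_optimal = (c + lam * d) / (lam - 1.0)         # right-hand side of (6)
    rows.append((c, fee_affine, fee_optimal))

# A uniform per-transaction fee cannot adapt to c; find the best one on a grid.
fees = [0.01 * k for k in range(1, 20001)]
best_uniform = max(fees, key=lambda f: sum(profit(c, d, lam, f) for c in costs))
profit_uniform = sum(profit(c, d, lam, best_uniform) for c in costs)
profit_affine = sum(profit(c, d, lam, fa) for _, fa, _ in rows)
print(rows, profit_uniform, profit_affine)
```

The two fee columns in `rows` should coincide for every good, which is the sense in which the affine fee replicates good-by-good pricing without observing c.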
The model is a simple illustrative example. Wang and Wright (2017)
show the result holds broadly, including when the demand takes more
general functional forms or involves unobserved random variations.
2. A GENERALIZED ANALYSIS
The two theories noted above provide alternative justifications for
the use of ad valorem fees by platforms. However, these two theories are
not mutually exclusive. In this section, we provide a
generalized analysis that accommodates both explanations. We show that,
in practice, a platform can choose a simple ad valorem fee that addresses
both double marginalization and price discrimination considerations. The
analysis and results in this section draw heavily from the online
appendix of Wang and Wright (2017).
In the generalized analysis, we consider a variety of different
goods being traded on a platform. We suppose that for each good there
are [n.sub.c] [greater than or equal to] 1 identical quantity-setting
sellers on the platform (i.e., Cournot competitors). This covers
different intensities of seller competition, including the two special
cases discussed in Section 1: when [n.sub.c] = 1, a good is sold by a
monopoly seller; when [n.sub.c] [right arrow] [infinity], sellers are
perfectly competitive. As before, each seller obtains the goods at a
unit cost c and sells them at a retail price [p.sub.c].
On the demand side, we assume as before that the value of good c to
a buyer drawing the benefit parameter b [greater than or equal to] 0 is
c (1 + b). To generalize the analysis, we now consider that 1 + b is
distributed according to the broad family of generalized Pareto
distributions (GPD), of which the simple Pareto distribution is a
special case. Accordingly, the cumulative distribution function F is
defined as
$$F(x) = 1 - \left(1 + \lambda(\sigma - 1)(x - 1)\right)^{\frac{1}{1 - \sigma}}, \tag{8}$$
with [lambda] > 0 being the scale parameter and [sigma] < 2
being the shape parameter. Only buyers know their own b, while F is
public information.
Note that the generalized Pareto distribution implies the demand
functions for sellers on the platform are defined by the class of
demands with constant curvature of inverse demand (6)
$$Q_c(p_c) = 1 - F\left(\frac{p_c}{c}\right) = \left(1 + \frac{\lambda(\sigma - 1)(p_c - c)}{c}\right)^{\frac{1}{1 - \sigma}}. \tag{9}$$
The constant [sigma] is the curvature of inverse demand, defined as
the elasticity of the slope of the inverse demand with respect to
quantity. When [sigma] < 1, the support of F is [1, 1 + 1/([lambda](1
- [sigma]))], and it has increasing hazard. Accordingly, the implied
demand functions [Q.sub.c]([p.sub.c]) are log-concave and include the
linear demand function ([sigma] = 0) as a special case. Alternatively,
when 1 < [sigma] < 2, the support of F is [1, [infinity]), and it
has decreasing hazard. The implied demand functions are log-convex and
include the constant elasticity demand function ([sigma] = 1 +
1/[lambda]) as a special case. When [sigma] = 1, F captures the left-truncated
exponential distribution F (x) = 1 - [e.sup.-[lambda](x-1)] on the
support [1, [infinity]), with a constant hazard rate [lambda]. This
implies the exponential (or log-linear) demand
$$Q_c(p_c) = e^{-\lambda (p_c - c)/c}.$$
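To make the special cases concrete, here is a minimal sketch of the demand function (9), with the exponential case handled as the limit [sigma] = 1. The function name and the test values are illustrative assumptions.

```python
import math

def gpd_demand(p, c, lam, sigma):
    """Demand (9): share of buyers with value c*(1+b) >= p under the GPD (8)."""
    x = max(p / c, 1.0)                    # every buyer purchases when p <= c
    if abs(sigma - 1.0) < 1e-12:           # exponential limit as sigma -> 1
        return math.exp(-lam * (x - 1.0))
    base = 1.0 + lam * (sigma - 1.0) * (x - 1.0)
    if base <= 0.0:                        # price above the top of the support
        return 0.0
    return base ** (1.0 / (1.0 - sigma))

p, c, lam = 15.0, 10.0, 1.5
constant_elasticity = gpd_demand(p, c, lam, 1.0 + 1.0 / lam)  # (p/c)^(-lam)
linear = gpd_demand(p, c, lam, 0.0)                           # 1 - lam*(p-c)/c
exponential = gpd_demand(p, c, lam, 1.0)                      # exp(-lam*(p-c)/c)
print(constant_elasticity, linear, exponential)
```

With [sigma] < 1 the support of F is bounded, so the sketch returns zero demand once the price exceeds c(1 + 1/([lambda](1 - [sigma]))).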
Taking as given that demand belongs to the generalized Pareto
class, we allow c to take on potentially many different values in
[[c.sub.L], [c.sub.H]], with the set of all such values being denoted C.
The cumulative distribution of c on C is denoted G, and [g.sub.c] is the
probability corresponding to the realization c.
The platform incurs a cost of d [greater than or equal to] 0 per
transaction. Without loss of generality, we assume that the platform
only charges the seller side to maximize its profit.
Below, in Section 2.1, as a benchmark, we first derive the
platform's optimal affine fee in a setting with generalized Pareto
demand and Bertrand sellers (or equivalently, sellers engage in Cournot
competition, but the number of sellers goes to infinity). This extends
the results we derived in Section 1.2, and we name the resulting fee
schedule the "Bertrand affine fee." In this general case, as
in Section 1.2, the Bertrand affine fee achieves optimal price
discrimination given that sellers have no market power. In Section 2.2,
we show that in a setting where sellers have market power and engage in
Cournot competition, the Bertrand affine fee continues to do well.
Particularly, we show that without knowing each good's cost and how
many sellers are competing, the platform can continue to use the
Bertrand affine fee and earn a higher profit than if it knew everything
and set the optimal per-transaction fee for each good. This is because
the Bertrand affine fee now achieves more than price discrimination; it
also mitigates double marginalization. We then derive analytical results
for the case d = 0 and show that while the Bertrand affine fee is not
necessarily the optimal affine fee when sellers have market power, it
can be very close. Therefore, in practice, a platform can implement the
Bertrand affine fee as a good proxy.
Bertrand Affine Fee
We start with deriving the Bertrand affine fee. Consider that the
platform charges sellers the fee schedule T([p.sub.c]). Assuming that
sellers engage in Bertrand competition, the price [p.sub.c] for good c solves
[p.sub.c] = c + T ([p.sub.c]). (10)
Accordingly, the platform's profit is [[PI].sub.c] = (T
([p.sub.c]) - d) [Q.sub.c] ([p.sub.c]) for good c, where [Q.sub.c]
([p.sub.c]) is given by (9). The platform's problem is to choose T
([p.sub.c]) to maximize
$$\Pi = \sum_{c \in C} g_c \left(T(p_c) - d\right) Q_c(p_c). \tag{11}$$
In Wang and Wright (2017), it is shown that the optimal fee
schedule is affine, given by
$$T(p_c) = \frac{\lambda d}{1 + \lambda(2 - \sigma)} + \frac{p_c}{1 + \lambda(2 - \sigma)}, \tag{12}$$
which maximizes (11). (7) Similar to our finding in Section 1.2,
while the affine fee (12) does not condition on c, it achieves optimal
price discrimination. To see this, note that the solution in (12) is
equivalent to the platform charging the optimal per-transaction fee
$$T_c^* = \frac{\lambda d + c}{\lambda(2 - \sigma)} \tag{13}$$
for each different good c, which would be possible if the platform
could identify each good c and set its optimal per-transaction fee
accordingly.
Our result in Section 1.2 is a special case of the Bertrand affine
fee given by (12), with [sigma] = 1 + 1/[lambda]. In the general case,
the platform's optimal affine fee again has a fixed per-transaction
component only if there is a positive cost to the platform of handling
each transaction (i.e., d > 0). Given [lambda] > 0 and [sigma]
< 2, the fee schedule is increasing (higher prices imply higher fees
are paid) but with a slope less than unity (this implies (10) has a
unique solution for any given c > 0). The result in (12) also implies
the platform can maximize its profit without tracking each individual
good c or knowing the distribution G of goods that are traded.
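The equivalence between (12) and (13) can be checked numerically. The sketch below solves the Bertrand pricing condition (10) by fixed-point iteration and compares the fee collected under (12) with the good-specific fee (13). Parameter values and function names are illustrative assumptions.

```python
def bertrand_affine_fee(p, d, lam, sigma):
    """Fee schedule (12)."""
    k = 1.0 + lam * (2.0 - sigma)
    return lam * d / k + p / k

def bertrand_price(c, d, lam, sigma, iters=500):
    """Solve p = c + T(p), equation (10), by fixed-point iteration.
    The slope of T is below one, so the iteration converges."""
    p = c
    for _ in range(iters):
        p = c + bertrand_affine_fee(p, d, lam, sigma)
    return p

def optimal_per_transaction_fee(c, d, lam, sigma):
    """Good-specific optimal fee (13)."""
    return (lam * d + c) / (lam * (2.0 - sigma))

d = 0.5
checks = []
for lam, sigma in [(4.0, 1.25), (1.0, 1.0), (1.0, 0.0)]:   # CES, exponential, linear
    for c in (2.0, 10.0, 50.0):
        p = bertrand_price(c, d, lam, sigma)
        collected = bertrand_affine_fee(p, d, lam, sigma)
        checks.append((collected, optimal_per_transaction_fee(c, d, lam, sigma)))
print(checks)
```

Each pair in `checks` should coincide, even though the fee schedule itself never uses c.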
Seller Market Power and Bertrand Affine Fee
We now study the platform's fee setting when sellers do have
market power. We will show that, in the case of Cournot sellers, the
platform can continue to use the Bertrand affine fee, which not only
achieves price discrimination but also mitigates double marginalization. As
a result, it leads to a higher platform profit than using optimal
per-transaction fees.
Optimal per-transaction fees
To start, we consider the problem of a platform with full
information on c (i.e., each good's cost) and [n.sub.c] (i.e., the number
of Cournot sellers) setting an optimal per-transaction fee for each
good.
Suppose the platform charges a per-transaction fee [T.sub.c] for
good c. Let [q.sub.c,i] denote the output sold by seller i for good c.
Each seller i sets [q.sub.c,i] taking the output by competing sellers
[q.sub.c,-i] = [Q.sub.c] - [q.sub.c,i] as given and maximizes its profit
([p.sub.c] - c - [T.sub.c]) [q.sub.c,i]. Assuming F follows the GPD
distribution (8), the total demand for good c is given by (9), which
implies that the inverse demand is
$$p_c = c\left(1 + \frac{Q_c^{1-\sigma} - 1}{\lambda(\sigma - 1)}\right).$$
Therefore, an individual seller's profit maximization problem
is
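$$\max_{q_{c,i}} \; \left[c\left(1 + \frac{(q_{c,-i} + q_{c,i})^{1-\sigma} - 1}{\lambda(\sigma - 1)}\right) - c - T_c\right] q_{c,i}.$$
The first-order condition for good c is
$$c\,\frac{Q_c^{1-\sigma} - 1}{\lambda(\sigma - 1)} - T_c - \frac{c}{\lambda}\, Q_c^{-\sigma}\, q_{c,i} = 0.$$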
In a symmetric Cournot equilibrium, [q.sub.c,i] = [q.sub.c] for
every seller, so the total sellers' output is [Q.sub.c] =
[n.sub.c][q.sub.c]. We can then rewrite the first-order condition as
$$\frac{c (n_c q_c)^{1-\sigma} - c}{\lambda(\sigma - 1)} = \frac{c (n_c q_c)^{1-\sigma}}{n_c \lambda} + T_c$$
and derive
$$Q_c = n_c q_c = \left(\frac{c n_c + \lambda(\sigma - 1) T_c n_c}{c n_c - (\sigma - 1) c}\right)^{\frac{1}{1-\sigma}}. \tag{14}$$
Accordingly, the price of good c is
$$p_c = c + \frac{c + \lambda n_c T_c}{\lambda (n_c + 1 - \sigma)}. \tag{15}$$
The platform takes (14) as given and maximizes its profit by
setting a per-transaction fee for good c as follows
$$\max_{T_c} \; (T_c - d) \left(\frac{c n_c + \lambda(\sigma - 1) T_c n_c}{c n_c - (\sigma - 1) c}\right)^{\frac{1}{1-\sigma}}.$$
The first-order condition implies the optimal per-transaction fee
[T.sub.c.sup.f]:
$$T_c^f = \frac{\lambda d + c}{\lambda(2 - \sigma)}, \tag{16}$$
which is the same optimal per-transaction fee that we derive in the
Bertrand seller setting (13). The optimal per-transaction fee does not
depend on the number of sellers and so also holds for a monopoly seller.
Note that to ensure a meaningful solution (i.e. [T.sub.c.sup.f] > d),
it is required that
d ([sigma] - 1) + c/[lambda] > 0. (17)
This is satisfied for the GPD demand specification: When demand is
log-linear or log-convex, the GPD specification requires that [sigma]
[greater than or equal to] 1 so the condition in (17) holds. When demand
is log-concave, the GPD specification requires that [sigma] < 1 and d
< c/([lambda](1 - [sigma])), so the condition in (17) again holds.
Substituting (16) into (14) and (15), we get
[mathematical expression not reproducible], (18)
and
[mathematical expression not reproducible]. (19)
As a result, the platform profit from good c is
[mathematical expression not reproducible].
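The claim that the optimal per-transaction fee (16) does not depend on the number of sellers can be checked with a small sketch that maximizes the platform's profit over a grid of fees, using the output expression (14). The parameter values, the grid, and the function names are illustrative assumptions; the sketch covers [sigma] different from 1 only.

```python
def cournot_output(c, n, lam, sigma, fee):
    """Total Cournot output (14) under a per-transaction fee (sigma != 1)."""
    ratio = (c * n + lam * (sigma - 1.0) * fee * n) / (c * n - (sigma - 1.0) * c)
    return max(ratio, 0.0) ** (1.0 / (1.0 - sigma))

def best_fee(c, d, n, lam, sigma, step=0.001, top=20.0):
    """Grid search for the fee maximizing (fee - d) * output."""
    fees = [d + step * k for k in range(1, int((top - d) / step))]
    return max(fees, key=lambda f: (f - d) * cournot_output(c, n, lam, sigma, f))

def closed_form_fee(c, d, lam, sigma):
    """Optimal per-transaction fee (16)."""
    return (lam * d + c) / (lam * (2.0 - sigma))

c, d = 10.0, 0.5
fees_found = {}
for lam, sigma in [(1.0, 0.0), (4.0, 1.25)]:      # linear, constant elasticity
    for n in (1, 3, 10):
        fees_found[(lam, sigma, n)] = best_fee(c, d, n, lam, sigma)
print(fees_found)
print(closed_form_fee(c, d, 1.0, 0.0), closed_form_fee(c, d, 4.0, 1.25))
```

For each demand specification, the grid maximizer should be the same for every n and should match (16) up to the grid step.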
Comparing Bertrand affine fee and optimal per-transaction fees
We now compare Bertrand affine fee and optimal per-transaction fees
in the Cournot seller setting.
Consider Cournot sellers facing an affine fee schedule T
([p.sub.c]) = [t.sub.0] + [t.sub.1][p.sub.c] for each transaction. With
GPD demand, the sellers' problem is to choose [q.sub.c,i] to
maximize
((1 - [t.sub.1])[p.sub.c] - c - [t.sub.0])[q.sub.c,i], (20)
where
$$p_c = c\left(1 + \frac{(q_{c,-i} + q_{c,i})^{1-\sigma} - 1}{\lambda(\sigma - 1)}\right). \tag{21}$$
In a symmetric Cournot equilibrium, [q.sub.c,i] = [q.sub.c] for
every seller, so the total sellers' output is [Q.sub.c] =
[n.sub.c][q.sub.c]. The first-order condition then requires
[mathematical expression not reproducible]. (22)
Substituting the Bertrand affine fee from equation (12) into (22)
gives the same price and output for a given c as we found above in (18)
and (19) for the full information case. That is, the price and output
for each good are identical to that implied by the optimal
per-transaction fee (16). However, the per-transaction fee for good c
implied by the Bertrand affine fee is now
[mathematical expression not reproducible],
which is strictly higher than the fee in (16) if and only if the
condition (17) holds. This implies the platform earns a higher profit
using the Bertrand affine fee than if it used the optimal
per-transaction fee for each different good assuming full information.
This result holds for any [n.sub.c] [greater than or equal to] 1 and so
also holds for monopoly sellers.
This result shows that the Bertrand affine fee can be used in this
setting to solve the price discrimination problem. It delivers the same
price and output for each good without using any information on each
good's cost. At the same time, the Bertrand affine fee generates a
higher profit for the platform because it mitigates the double
marginalization problem associated with using the optimal
per-transaction fee for each good, allowing the platform to collect a
higher fee from each good while achieving the same level of final price
and output.
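The comparison can be illustrated numerically. The sketch below computes the symmetric Cournot outcome under an affine fee by solving the sellers' first-order condition implied by (20) and (21) for [Q.sub.c.sup.1-[sigma]]; this closed form is our own rearrangement and is stated as an assumption. It then compares the Bertrand affine fee (12) with the optimal per-transaction fee (16). Parameter values are illustrative, and the sketch covers [sigma] different from 1 only.

```python
def cournot_equilibrium(c, n, lam, sigma, t0, t1):
    """Symmetric Cournot outcome under the fee t0 + t1 * p (sigma != 1).
    x denotes Q^(1 - sigma), obtained from the sellers' first-order condition."""
    s = 1.0 - t1
    x = n * (1.0 + lam * (sigma - 1.0) * ((c + t0) / (c * s) - 1.0)) / (n + 1.0 - sigma)
    q = x ** (1.0 / (1.0 - sigma))                        # total output
    p = c * (1.0 + (x - 1.0) / (lam * (sigma - 1.0)))     # inverse demand
    return p, q, t0 + t1 * p                              # price, output, fee per sale

c, d = 10.0, 0.5
results = []
for lam, sigma in [(4.0, 1.25), (1.0, 0.0)]:      # constant elasticity, linear
    k = 1.0 + lam * (2.0 - sigma)
    fee_f = (lam * d + c) / (lam * (2.0 - sigma))  # per-transaction fee (16)
    for n in (1, 3):
        per_txn = cournot_equilibrium(c, n, lam, sigma, fee_f, 0.0)
        affine = cournot_equilibrium(c, n, lam, sigma, lam * d / k, 1.0 / k)
        results.append((per_txn, affine))

for per_txn, affine in results:
    print(per_txn, affine)
```

In each case the price and output should be identical across the two fee schedules, while the fee collected per sale is higher under the Bertrand affine fee, so the platform's profit is higher as well.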
Comparing Bertrand affine fee and optimal affine fee
We have so far shown that the Bertrand affine fee profit-dominates the
optimal per-transaction fees when sellers have market power. In this
section, assuming d = 0, we show that the Bertrand affine fee schedule (12) is
indeed very close to the optimal affine fee schedule under Cournot
sellers. (8) Note that given d = 0, the Bertrand affine fee (12) implies
the proportional fee schedule
$$T^*(p_c) = \frac{p_c}{1 + (2 - \sigma)\lambda}. \tag{23}$$
We can then check whether this is the optimal affine fee schedule
under Cournot sellers.
Consider a platform maximizing its profit by using an affine fee
schedule [t.sub.0] + [t.sub.1][p.sub.c]. As before, we assume that the
platform cannot subsidize sellers to operate by setting [t.sub.0] <
0. This imposes the requirement that [t.sub.0] [greater than or equal
to] 0.
Cournot sellers take the platform affine fee schedule T ([p.sub.c])
= [t.sub.0] + [t.sub.1][p.sub.c] as given for each transaction. As shown
above, with a GPD demand, the sellers' problem is given by (20) and
(21), and the first-order condition for seller's profit-maximizing
problem is given by (22).
Anticipating sellers' responses, the platform then solves the
following problem:
$$\max_{t_0, t_1} \; \sum_{c \in C} g_c \left(t_0 + t_1 p_c - d\right) Q_c$$
subject to the constraint [t.sub.0] [greater than or equal to] 0 as
well as the conditions
$$p_c = c\left(1 + \frac{Q_c^{1-\sigma} - 1}{\lambda(\sigma - 1)}\right) \tag{24}$$
and
[mathematical expression not reproducible], (25)
where (24) is given by the GPD demand and (25) is the first-order
condition (22). We can verify that the constraint [t.sub.0] [greater
than or equal to] 0 is binding at the maximum, so the optimal affine fee
schedule is also just a proportional fee schedule. Moreover, given that
[t.sub.0] = 0, [p.sub.c]/c does not depend on c, so the platform can
solve for the optimal [t.sub.1] without knowing the distribution of c.
The first-order condition on [t.sub.1] requires
[mathematical expression not reproducible]. (26)
The optimal proportional fee implied by (26) is in general not
equal to the proportional fee implied by (23), but based on an
examination of some common demand functions, it is very close and so are
the profits, as discussed below.
Consider first the case of constant elasticity demand, where
[sigma] = 1 + 1/[lambda] and [lambda] > 1. In this case, both (23)
and (26) yield [t.sub.1] = 1/[lambda] and so have identical profits.
Thus, in this case, the Bertrand affine fee coincides with the optimal
affine fee schedule. This result confirms our findings in Sections 1.1
and 1.2 that when d =0, the optimal affine fee under double
marginalization (i.e., [t.sub.0] = 0, [t.sub.1] = 1/[lambda]) coincides
with that which achieves optimal price discrimination (which is again
[t.sub.0] = 0, [t.sub.1] = 1/[lambda]).
Next, consider the case of exponential demand where [sigma] =1.
Then (26) implies the optimal proportional fee satisfies
$$(1 - t_1)^3 + \lambda (1 - t_1)(n_c - t_1) = n_c t_1 \lambda^2,$$
which has a unique solution. In contrast, (23) implies the
proportional fee
$$t_1 = \frac{1}{1 + \lambda}.$$
The two fees are not exactly equal, but they are very close. For
the empirically meaningful range where the proportional term [t.sub.1]
of the Bertrand affine fee satisfies [t.sub.1] [less than or equal to]
50 percent (or
equivalently, [lambda] [greater than or equal to] 1), the Bertrand
affine fee can recover more than 98.5 percent of the profit under the
optimal affine fee schedule when all sellers are monopolists (so
[n.sub.c] = 1 for all c). Moreover, the profit gap between using the
Bertrand affine fee and using the optimal affine fee schedule decreases
monotonically in [n.sub.c], and the two converge as the number of
Cournot sellers gets large.
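A short sketch can reproduce this comparison for exponential demand. It solves the first-order condition above by bisection and computes the share of the optimal profit recovered by the Bertrand fee 1/(1 + [lambda]). The relative price used in `profit_exp` is our own rearrangement of the Cournot first-order condition with [t.sub.0] = 0 and d = 0, stated here as an assumption, and the parameter values are illustrative.

```python
import math

def profit_exp(t1, lam, n):
    """Platform profit per unit of c: exponential demand, d = 0, t0 = 0."""
    rel_price = 1.0 / (1.0 - t1) + 1.0 / (n * lam)   # p/c chosen by Cournot sellers
    return t1 * rel_price * math.exp(-lam * (rel_price - 1.0))

def foc_residual(t1, lam, n):
    """Left-hand side minus right-hand side of the condition on t1."""
    return (1 - t1) ** 3 + lam * (1 - t1) * (n - t1) - n * t1 * lam ** 2

def optimal_t1(lam, n, tol=1e-12):
    """Bisection: the residual is positive at t1 = 0 and negative at t1 = 1."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if foc_residual(mid, lam, n) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def profit_share(lam, n):
    """Profit under the Bertrand fee relative to the optimal proportional fee."""
    bertrand = profit_exp(1.0 / (1.0 + lam), lam, n)
    return bertrand / profit_exp(optimal_t1(lam, n), lam, n)

shares = {(lam, n): profit_share(lam, n) for lam in (1.0, 2.0, 5.0) for n in (1, 50)}
print(shares)
```

Under these assumptions, the recovered share should be lowest for monopoly sellers at [lambda] = 1 and should approach one as [n.sub.c] grows.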
Finally, consider the case of linear demand where [sigma] = 0. Then
(26) implies that the optimal proportional fee satisfies
[(1 - [t.sub.1]).sup.2] (1 + [lambda]) (1 - [t.sub.1] -
[t.sub.1][lambda]) - [t.sub.1](1 - [t.sub.1])[lambda] (1 + [lambda]) =
[n.sub.c] (2[t.sub.1][[lambda].sup.2] - [lambda](1 - [t.sub.1])),
which has a unique solution. In contrast, (23) implies the
proportional fee
$$t_1 = \frac{1}{1 + 2\lambda}.$$
For the empirically meaningful range where the proportional term [t.sub.1]
of the Bertrand affine fee satisfies [t.sub.1] [less than or equal to]
50 percent (or equivalently, [lambda] [greater than or equal to] 0.5),
the Bertrand affine fee can recover more than 97.5 percent of the profit
under the optimal affine fee schedule when all sellers are monopolists
(so [n.sub.c] = 1 for all c). Again, the profit gap between using the
Bertrand affine fee schedule and using the optimal affine fee decreases
monotonically in [n.sub.c], and the two converge as the number of
Cournot sellers gets large.
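The linear-demand comparison can be sketched in the same way, here by direct grid search over the proportional fee rather than by solving the first-order condition. The equilibrium output used in `profit_linear` is our own rearrangement of the Cournot first-order condition with [sigma] = 0, [t.sub.0] = 0, and d = 0, stated as an assumption; the parameter values are illustrative.

```python
def profit_linear(t1, lam, n):
    """Platform profit per unit of c: linear demand, d = 0, t0 = 0."""
    s = 1.0 - t1
    q = n * (lam + 1.0 - lam / s) / (n + 1.0)   # total Cournot output
    if q <= 0.0:                                # fee so high that nothing is sold
        return 0.0
    rel_price = 1.0 + (1.0 - q) / lam           # p/c from inverse demand
    return t1 * rel_price * q

def profit_share_linear(lam, n, steps=99999):
    """Profit under the Bertrand fee relative to the best fee on a grid."""
    bertrand = profit_linear(1.0 / (1.0 + 2.0 * lam), lam, n)
    grid = [(k + 1) / (steps + 1) for k in range(steps)]
    best = max(profit_linear(t1, lam, n) for t1 in grid)
    return bertrand / max(best, bertrand)

share_monopoly = profit_share_linear(0.5, 1)
share_many = profit_share_linear(0.5, 50)
print(share_monopoly, share_many)
```

With monopoly sellers and [lambda] = 0.5, the share recovered should sit just above the 97.5 percent bound, and it should rise toward one as the number of sellers increases.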
The findings in Section 2 are summarized below.
Assume that the demand functions for sellers on the platform belong
to the generalized Pareto class with [lambda] > 0 and [sigma] < 2
and that for each good c there are [n.sub.c] [greater than or equal to]
1 identical sellers that set quantities. Then we have the following
results:
(i) the platform obtains a higher profit using the Bertrand affine
fee than if it sets the optimal per-transaction fee for each good;
(ii) if sellers face constant elasticity demand ([sigma] = 1 +
1/[lambda] and [lambda] > 1) and d = 0, the Bertrand affine fee is
the optimal affine fee schedule;
(iii) if sellers face exponential demand ([sigma] = 1), [lambda]
> 1, and d = 0, the Bertrand affine fee can recover more than 98.5
percent of the profit under the optimal affine fee schedule;
(iv) if sellers face linear demand ([sigma] = 0), [lambda] >
0.5, and d =0, the Bertrand affine fee can recover more than 97.5
percent of the profit under the optimal affine fee schedule.
3. A QUANTITATIVE EXERCISE
Finally, we may consider the general case in which d > 0 and
compare the platform's profit from the Bertrand affine fee (12)
with its profit from the optimal fee schedules, including nonlinear
ones. This exercise was carried out in detail in Wang and Wright (2017),
and we summarize the findings here.
Once we allow for a nonlinear fee schedule, the optimal fee
schedule will depend on the distribution of goods G(c). This is also
true for the optimal affine fee schedule once we allow d > 0.
Therefore, to proceed, one needs to assume some realistic distribution
for c and calculate the profitability of different fee schedules
numerically. Wang and Wright (2017) use the distribution based on
fitting a log-normal distribution to the actual distribution of sales
obtained from sales ranks of DVDs sold on Amazon. (9) It is assumed that
sellers face constant elasticity demand, and d = 1.35 and [sigma] = 1.15
so that the calibrated Bertrand fee schedule matches the actual fee
schedule used by Amazon for DVDs (which is $1.35+15 percent). Sellers
are assumed to be monopolists (i.e., [n.sub.c] = 1). (10)
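The calibration can be verified with simple arithmetic. Under constant elasticity demand, [sigma] = 1 + 1/[lambda], so a proportional term of 15 percent pins down [lambda] and [sigma], and the fixed term of the Bertrand affine fee (12) equals d. The function name below is an illustrative assumption.

```python
def bertrand_affine_coefficients(d, lam, sigma):
    """Intercept and slope of the Bertrand affine fee (12)."""
    k = 1.0 + lam * (2.0 - sigma)
    return lam * d / k, 1.0 / k

t1_target = 0.15                 # Amazon's proportional fee for DVDs
lam = 1.0 / t1_target            # constant elasticity: t1 = 1/lam
sigma = 1.0 + 1.0 / lam          # equals 1.15
t0, t1 = bertrand_affine_coefficients(1.35, lam, sigma)
print(sigma, t0, t1)
```

The resulting schedule is the $1.35 plus 15 percent fee described above.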
With these assumptions, it is found that the platform obtains a
profit of 0.383 with a fixed per-transaction fee (i.e., without any
price discrimination). (11) If the platform could observe each different
good sold by the sellers, it could do better by setting the
per-transaction fee that is optimal for each good c. This increases its
profit by 17.7 percent to 0.457, which represents the gain due to price
discrimination.
Moreover, the benefits of price discrimination can be obtained by using
the Bertrand fee schedule, which does not require any information on the
values of c and has the added benefit of mitigating double
marginalization. Indeed, the platform can increase its profit to 0.537,
or a further 16.3 percent, by using the Bertrand fee schedule. Taking
into account that sellers are monopolists and the particular
distribution of c, the platform can increase its profit by a further 1.5
percent by moving to the optimal affine fee schedule.
Finally, Wang and Wright (2017) obtain the platform's profit
for the optimal nonlinear fee schedule, which comes from solving for the
optimal polynomial fee schedule of degree k, starting with k = 1 (the
affine fee schedule) and considering higher and higher k until the
platform's profit no longer increases. Compared with the optimal
affine fee schedule, moving to the optimal nonlinear fee schedule only
increases the platform's profit by a further 1.3 percent. The
results are summarized in Table 1. The table also shows the results from
repeating the exercise with linear demand.
Quantitatively, the results show that the platform loses little
from restricting fee schedules to affine fee schedules or indeed to the
Bertrand affine fee. In the constant-elasticity demand case, price
discrimination and double marginalization have similar quantitative
effects in justifying the platform's use of the Bertrand affine
fee: using the Bertrand affine fee increases the platform's profit by
33.8 percent compared with using a fixed per-transaction fee, where 17.7
percent comes from price discrimination and 16.3 percent comes from
mitigating double marginalization. In the linear demand case, the effect
of price discrimination turns out to be larger than that of mitigating
double marginalization: using the Bertrand affine fee increases the
platform's profit by 49.6 percent compared with using a fixed
per-transaction fee, where 42.4 percent comes from price discrimination
and 7.2 percent comes from mitigating double marginalization.
4. CONCLUSION
In this article, we review two alternative explanations for why
platforms use ad valorem fees: double marginalization versus price
discrimination. Using a generalized framework, we show that the two
theories complement each other in explaining this pricing puzzle, and
their relative importance is quantified in a calibration exercise.
Our findings set the stage for normative analysis. Given that
platforms do not incur significant costs that vary with transaction
prices, there have been policy concerns regarding their use of ad
valorem fees. Using the framework discussed in this article, one could
evaluate the welfare consequences of regulating platforms' use of
ad valorem fees. In fact, Shy and Wang (2011) and Wang and Wright
(forthcoming) have shown that banning platforms' use of ad valorem
fees tends to reduce social welfare in the presence of double
marginalization or price discrimination. Therefore, caution ought to be
taken when policymakers consider intervening in platforms' use of
ad valorem pricing.
REFERENCES
Aguirre, Inaki, Simon Cowan, and John Vickers. 2010. "Monopoly
Price Discrimination and Demand Curvature." American Economic
Review 100 (September): 1601-15.
Bulow, Jeremy, and Paul Pfleiderer. 1983. "A Note on the
Effect of Cost Changes on Prices." Journal of Political Economy 91
(February): 182-85.
Bulow, Jeremy, and Paul Klemperer. 2012. "Regulated Prices,
Rent Seeking, and Consumer Surplus." Journal of Political Economy
120 (February): 160-86.
Foros, Oystein, Hans Jarle Kind, and Greg Shaffer. 2013.
"Turning the Page on Business Formats for Digital Platforms: Does
Apple's Agency Model Soften Competition?" Working Paper.
Gaudin, Germain, and Alexander White. 2014. "On the Antitrust
Economics of the Electronic Books Industry." Dusseldorf Institute
for Competition Economics Discussion Paper 147 (May).
Hagiu, Andrei, and Julian Wright. Forthcoming. "The Optimality
of Ad Valorem Contracts." Management Science.
Johnson, Justin. 2017. "The Agency Model and MFN
Clauses." Review of Economic Studies 84 (July): 1151-85.
Loertscher, Simon, and Andras Niedermayer. 2012. "Fee-setting
Mechanisms: On Optimal Pricing by Intermediaries and Indirect
Taxation." Governance and the Efficiency of Economic Systems
Discussion Paper 434 (October).
Miao, Chun-Hui. 2013. "Do Card Users Benefit from the Use of
Proportional Fees?" Review of Network Economics 12 (September):
323-41.
Muthers, Johannes, and Sebastian Wismer. 2013. "Why Do
Platforms Charge Proportional Fees? Commitment and Seller
Participation." Working Paper.
Shy, Oz, and Zhu Wang. 2011. "Why Do Payment Card Networks
Charge Proportional Fees?" American Economic Review 101 (June):
1575-90.
Wang, Zhu, and Julian Wright. 2017. "Ad Valorem Platform Fees,
Indirect Taxes, and Efficient Price Discrimination." RAND Journal
of Economics 48 (Summer): 467-84.
Wang, Zhu, and Julian Wright. Forthcoming. "Should Platforms
Be Allowed to Charge Ad Valorem Fees?" Journal of Industrial
Economics.
Weyl, Glen, and Michal Fabinger. 2013. "Pass-Through as an
Economic Tool: Principles of Incidence under Imperfect
Competition." Journal of Political Economy 121 (June): 528-83.
Research Department, Federal Reserve Bank of Richmond. Email:
zhu.wang@rich.frb.org. I thank Eric LaRose, John Weinberg, Alexander
Wolman, and Russell Wong for helpful comments. The views expressed are
solely those of the author and do not necessarily reflect the views of
the Federal Reserve Bank of Richmond or the Federal Reserve System.
(1) In the industrial organization literature, double
marginalization refers to the phenomenon in which firms at different
vertical levels of the supply chain (e.g., upstream and downstream)
each have market power and each apply their own markup to the price.
For example, consider a firm with market power that buys an input from
another firm that also has market power. The producer of the input
will price above marginal cost when it sells the input to the other
firm, which will then price above marginal cost again when it sells the
final product that uses the input. This means the input is marked up
above marginal cost twice, which is called double marginalization.
(2) In a similar vein, several studies (e.g., Foros et al. 2013;
Gaudin and White 2014; and Johnson 2017) have explored the advantages of
the so-called agency model used by mass retailers such as Amazon, where
the retailer lets suppliers (i.e., sellers) set final prices and keeps
a share of the revenue, which is equivalent to charging a percentage fee.
Like Shy and Wang (2011), they also show that the revenue sharing used
in the agency model has the advantage of mitigating double
marginalization.
(3) A higher c (i.e., higher cost) implies in the model that the
gains from trade are higher in expectation (due to the multiplicative
connection between c and b). One interpretation for this specification,
as shown in Wang and Wright (2017), is that such a platform reduces
trading frictions, and as a result the value to buyers of using the
platform (so that they can avoid the loss of using a less-efficient
trade intermediary) is proportional to the cost or price of the goods
traded. Note that the assumption b > 0 is an innocuous normalization
because consumers whose valuation for a product is less than its cost
can be ignored.
(4) Cournot competition refers to an oligopoly market structure in
which multiple firms producing a homogeneous product compete by choosing
outputs independently and simultaneously. Assuming a fixed number of
Cournot sellers, Shy and Wang (2011) show that the platform earns a
higher profit by using a proportional fee than a per-transaction fee.
Miao (2013) shows that the result continues to hold under free entry of
sellers.
(5) Bertrand competition is a model of competition in which
multiple firms producing a homogeneous product compete by setting prices
simultaneously and consumers buy only from the firm charging the lowest
price.
(6) This class of demands has been considered by Bulow and
Pfleiderer (1983), Aguirre et al. (2010), Bulow and Klemperer (2012),
and Weyl and Fabinger (2013), among others.
(7) With this model setting, the optimal platform fee schedule is
affine and does not condition on c if and only if the distribution of
buyers' benefits F is the generalized Pareto distribution. See Wang
and Wright (2017) for a detailed proof.
(8) If d > 0 the results will depend on the distribution of c.
We discuss this case in Section 3.
(9) Using a web robot, Wang and Wright (2017) collected data on
every DVD listed under "Movies & TV" on Amazon's
marketplace in January 2014. Because shipping fees are often not
included in the listed price, the focus is on items whose listed price
included free shipping, resulting in a sample of 191,280 distinct
items. The data collected include the title, the unique ASIN
identifying the DVD, the price, and the sales rank of each DVD. Because
the sales of each DVD are not directly observable, a power law is used to
infer them from the sales rank data, so [Q.sub.c] = a[R.sub.c.sup.-[phi]],
where [Q.sub.c] is the estimated sales of item c and [R.sub.c] is the
corresponding sales rank. The parameter a does not affect the analysis,
so it is normalized to a = 1. It is assumed that [phi] = 1.7, which is the
value suggested by an experimental study on DVD sales on Amazon.
(10) This quantitative exercise evaluates how well the Bertrand
affine fee performs under Cournot sellers. Assuming monopoly sellers is
the most extreme alternative to Bertrand competition, so it provides the
most conservative results.
(11) Note that because the sales of DVDs are inferred from data on
sales ranks with scale normalized, only the relative (but not the
absolute) value of the platform profit is meaningful for comparison.
COPYRIGHT 2018 Federal Reserve Bank of Richmond