Abstract
Many diversity, concentration and entropy metrics are analogous, sharing common origins in ecological science, engineering, mathematics and communication theory. The terminology differs according to the discipline in which the metrics are applied. This article defines the concepts of concentration, diversity and entropy. The various indices used to measure these phenomena are examined and their origins identified. Selected examples of their application are reviewed. The aim is to provide a background to concentration metrics that have been applied in different fields of study, thus aiding the identification of topics where they might be used in future research.
Introduction
Diversity and concentration indices are analogous to one another, sharing common origins in ecological science, engineering, mathematics, entropy and communication theory. The only difference between the terms diversity and concentration is that the former is used, together with evenness and equitability, when the principles are applied in the ecological sciences, while the latter is used when the same principles are applied to research in economics.
For example, in the context of an investment portfolio, concentration would be high if the majority of capital were invested in just a few relatively homogeneous securities, and low if the capital were distributed evenly between many heterogeneous securities. More specifically, stock market concentration refers to the degree to which a few disproportionately large firms dominate the returns of value-weighted stock market indices such as the FTSE 100. The opposite situation is low concentration, or a fragmented market, in which numerous small firms each account for a relatively small share of the overall market.
In industrial economics, concentration is high when the majority of the output within an economy is accounted for by relatively few industries, or when the majority of output within an industry is accounted for by relatively few firms. In studies of ecology, diversity may refer to the degree to which the biomass in an ecosystem is distributed across many species of flora and fauna, as in a coral reef or tropical rainforest, or is concentrated in just a few species, as in a monoculture of Sitka spruce trees.
Examples of diversity metrics include the Simpson Index, which can be traced to a paper by Simpson titled “Measurement of Diversity”, published in Nature in 1949. It is almost identical to the Hirschman-Herfindahl Index of industry concentration, which is used in economics and was developed independently by both Herfindahl (1950) and Hirschman (1964). The Simpson Index has also been referred to as the Yule Index, after the similar measure that Yule (1944) devised to characterise the vocabulary used by different authors. Simpson (1949) cites Yule (1944) as a key reference for his index, which combines the ideas of Yule (1944) with those of Fisher et al. (1943) and Williams (1946).
The entropy index of Shannon (1948) evolved from the theory of communications engineering. Earlier researchers in this field include Nyquist (1924) and Hartley (1928), whose objective was to increase the efficiency of telegraphic information transmission. The Shannon Index is discussed further by Shannon & Weaver (1964) and later by Fernholz et al. (1998). In ecological texts the index is often referred to as the Shannon-Wiener Index, in deference to Wiener (1961), who arrived at a similar index independently in 1948. In fact, in the 1961 edition of his book “Cybernetics”, Wiener cites the statistician Fisher, the same Fisher whom Simpson (1949) cites as a key reference in the article that details the Simpson index of diversity.
Wiener also cites Shannon (1948) as a key developer of the index. Shannon and Weaver (1963) in turn refer to Wiener in their book, which updates and develops some of the ideas outlined in Shannon (1948). A relatively small group of statisticians, mathematicians and engineers therefore exerted a common influence on the development of the Simpson, Shannon and market concentration indices. Without reference to Shannon, Hart (1971) discusses entropy and other measures of concentration in the context of economics and business concentration. Diversity indices are also reviewed by Peet (1975), who cautions against the inappropriate scaling of diversity indices when conducting studies of community ecology.
Characteristics of concentration indices
At the generic level, absolute measures of concentration take into account both the number of different categories of units in a sample and the distribution of relative weights between these categories. In an economic context, the units could be individual securities and the categories could be different firms or industries. In contrast, inequality measures of concentration take into account only the dispersion in the distribution of the weights between categories, and not the number of categories (Clarke, 1993).
When comparing portfolios that have different numbers of constituents, or when the number of constituents in a portfolio or market is changing over time, there is an argument in favour of using absolute measures of concentration. An inequality measure of concentration allows comparison between samples that contain the same number of units but in which individual units account for different proportions of the total value or mass.
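The distinction between absolute and inequality measures can be illustrated with a small numerical sketch. The choice of measures here is an assumption made for illustration only: the Hirschman-Herfindahl Index is computed in its common form as the sum of squared shares, and the Gini coefficient stands in as a typical inequality measure, although the text above does not name one. The function names `herfindahl` and `gini` and the sample weights are hypothetical.

```python
def herfindahl(weights):
    """Absolute measure (assumed form): sum of squared shares."""
    total = sum(weights)
    return sum((w / total) ** 2 for w in weights)


def gini(weights):
    """Inequality measure (assumed choice): Gini coefficient.

    Uses the rank-based formula on weights sorted in ascending order:
    G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n, with i = 1..n.
    """
    xs = sorted(weights)
    n = len(xs)
    total = sum(xs)
    ranked_sum = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * ranked_sum / (n * total) - (n + 1) / n


# Two equally weighted samples that differ only in the number of units.
four_equal = [25, 25, 25, 25]
ten_equal = [10] * 10

# The absolute measure falls as the number of units rises (1/n for equal weights).
print(herfindahl(four_equal))  # 0.25
print(herfindahl(ten_equal))   # approximately 0.1

# The inequality measure is zero for both, because it ignores the number of units.
print(gini(four_equal), gini(ten_equal))
```

Under these assumptions, the two equally weighted samples are indistinguishable to the inequality measure, while the absolute measure reports the ten-unit sample as less concentrated. This is the property that favours absolute measures when the number of constituents differs between portfolios or changes over time.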
Concentration Curve
The concentration curve is an absolute measure of concentration in which firm size inequality is represented by the convexity of the curve, while the number of firms is indicated by the point at which the curve reaches 100% of the weight (Clarke, 1993). A hypothetical example is provided in Figure 1, which is recreated from Clarke (1993). The cumulative percentage weight is plotted on the y-axis against the number of firms, ranked from the largest, on the x-axis. When concentration is at its lower limit, with weights equal for all firms, the line is straight. As concentration increases, the curve becomes more convex and moves further from the straight line, so that the shaded area in Figure 1 becomes larger.
A curve that is very steep initially and then flattens represents a portfolio that is dominated by one or two large firms but still contains many small firms of approximately equal size. On the other hand, if the concentration curve is smoother, large firms may still dominate, but the decrease in firm size from largest to smallest is more continuous. Different concentration metrics place emphasis upon different parts of the concentration curve.
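The construction of the concentration curve described above can be sketched in a few lines. This is a minimal illustration rather than the procedure used by Clarke (1993): the function name `concentration_curve` and the two sample portfolios are hypothetical, and the weights are assumed to be non-negative with a positive total.

```python
def concentration_curve(weights):
    """Return the cumulative percentage weight after each firm,
    with firms ranked from largest to smallest."""
    total = sum(weights)
    ordered = sorted(weights, reverse=True)  # largest firm first
    curve = []
    running = 0.0
    for w in ordered:
        running += w
        curve.append(100.0 * running / total)
    return curve


# Equal weights: the curve rises in equal steps, i.e. a straight line.
equal = concentration_curve([25, 25, 25, 25])
print(equal)  # [25.0, 50.0, 75.0, 100.0]

# One dominant firm: steep at first, then flatter.
dominated = concentration_curve([10, 70, 10, 10])
print(dominated)  # [70.0, 80.0, 90.0, 100.0]
```

The y-values returned correspond to the cumulative percentage weight, and the position in the list corresponds to the number of firms on the x-axis. The length of the list is the number of firms at which the curve reaches 100%, and the gap between the second curve and the first reflects the greater concentration of the dominated portfolio.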