This glossary of data envelopment analysis terms is intended to help
new users of Frontier Analyst and DEA to understand data envelopment analysis
and the terms associated with it. This glossary is not intended as an
exhaustive list, but deals with the key terms used in DEA. The glossary
is presented in alphabetical order.
To the best of our knowledge, the information given here is correct,
but please let us know if you find anything in this glossary which you
think is incorrect. Please email corrections to info@banxia.com. Thank
you.
| Aggregate efficiency |
A term used to describe the measure of efficiency from the CCR model.
|
| Allocative efficiency |
The efficiency of a production process in converting
inputs to outputs, where the cost of production is minimized for a
given set of input prices. Allocative efficiency can be calculated
by the ratio of cost efficiency to technical efficiency. |
| BCC |
The BCC (ratio) model is the DEA model used in Frontier Analyst
when a variable returns to scale relationship is assumed between inputs
and outputs. It is named BCC after Banker, Charnes and Cooper who
first introduced it in an article published in Management Science (1984,
Vol. 30/9, pp. 1078-1092). The BCC model measures technical efficiency.
The convexity constraint in the model formulation ensures that the
composite unit is of similar scale size as the unit being measured.
The efficiency score obtained from this model gives a score which
is at least equal to the score obtained using the CCR model. |
| Categorical variable |
Categorical variables can only assume a predefined set of discrete
values. For example, when analyzing a chain of retail outlets, the
analyst may want to represent the existence of a particular service,
say cash dispenser machines at each outlet. Outlets are assigned a
'1' to indicate where the service is available and a '0' where the
service is not available. Categorical variables are generally used
to indicate the presence or lack of a particular attribute. The use
of categorical variables requires modifications to the DEA models. |
| CCR |
The CCR (ratio) model is probably the most widely used and best
known DEA model. It is the DEA model used in Frontier Analyst when
a constant returns to scale relationship is assumed between inputs
and outputs. It was the first DEA model to be developed, named CCR
after Charnes, Cooper and Rhodes who introduced this model in an article
published in the European Journal of Operational Research (1978, Vol.
2 pp. 429-444). This model calculates the overall efficiency for each
unit, where both pure technical efficiency and scale efficiency are
aggregated into one value. |
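As a minimal sketch of what the CCR score means, consider the special case of exactly one input and one output. Under constant returns to scale the score then reduces to each unit's output/input ratio divided by the best ratio in the data set. The general multi-input, multi-output case needs a linear program and is not shown here. The function name and the figures are invented for illustration.

```python
def ccr_scores_single(inputs, outputs):
    """CCR efficiency for the single-input, single-output case only.

    inputs, outputs: dicts mapping unit name -> positive value.
    Returns a dict mapping unit name -> score between 0 and 1.
    """
    ratios = {name: outputs[name] / inputs[name] for name in inputs}
    best = max(ratios.values())  # the unit(s) defining the frontier
    return {name: ratio / best for name, ratio in ratios.items()}


# Hypothetical data: staff hours (input) and transactions (output).
unit_inputs = {"A": 10.0, "B": 20.0, "C": 30.0}
unit_outputs = {"A": 20.0, "B": 30.0, "C": 30.0}

scores = ccr_scores_single(unit_inputs, unit_outputs)
for name, score in scores.items():
    print(f"{name}: {score:.0%}")
```

Unit A has the highest output per unit of input, so it scores 100% and the other units are rated relative to it.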
| Composite unit |
The attributes of a composite unit (which is a hypothetical efficient
unit) are determined by the projection of an inefficient unit, through
the origin, to the efficiency frontier. The attributes are formed
from the DMU's (unit's) reference units, in the proportions indicated
by the dual weights. |
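The way a composite unit is built from reference units can be sketched as a weighted sum. The example below assumes the dual weights have already been obtained from a DEA run. The unit names, attribute values and weights are made up.

```python
def composite_unit(reference_units, dual_weights):
    """Combine reference units in the proportions given by the dual weights.

    reference_units: dict of unit name -> list of attribute values
                     (inputs and outputs, in a fixed order).
    dual_weights:    dict of unit name -> dual weight for that unit.
    """
    n_attributes = len(next(iter(reference_units.values())))
    composite = [0.0] * n_attributes
    for name, weight in dual_weights.items():
        for k, value in enumerate(reference_units[name]):
            composite[k] += weight * value
    return composite


# Hypothetical reference units: [input 1, input 2, output].
references = {"A": [10.0, 20.0, 100.0], "B": [30.0, 10.0, 120.0]}
weights = {"A": 0.5, "B": 0.25}

composite = composite_unit(references, weights)
print(composite)
```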
| Constant returns to scale |
Constant returns to scale may be assumed if an increase in a unit's
inputs leads to a proportionate increase in its outputs i.e. there
is a one-to-one, linear relationship between inputs and outputs. For
example, if a 10% increase in inputs yields a 10% increase in outputs,
the unit is operating at constant returns to scale. This means that
no matter what scale the unit operates at, its efficiency will, assuming
its current operating practices, remain unchanged. |
| Controlled (discretionary) inputs |
A controlled input is one over which the management of the unit
has control and, as a result, can alter the amount of it used. (Controlled
inputs are also sometimes referred to as discretionary inputs). |
| Convexity constraint |
The convexity constraint, which forms part of the formulation of
the BCC model, ensures that each composite unit is a convex combination
of its reference units. For a full definition of convexity, please
refer to a standard non-linear programming text, such as "Nonlinear
programming", Bazaraa et al., Wiley 1993. |
| Correlation coefficient |
A measure of the strength of the relationship between
two variables. A relationship exists between two variables when, as
the value of one variable changes, the value of the other changes in a
related manner. The value of the correlation coefficient lies between
+1 and -1. If larger values of one variable are reflected with larger
values of the other, then the value of the coefficient will be positive.
The stronger the relationship the closer the coefficient will be to
+1. If larger values of one variable are reflected with smaller values
of the other variable then the value of the coefficient will be negative.
The stronger the relationship the closer the coefficient will be to
-1. If there is no relationship between the variables the coefficient
will be close to zero. |
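The glossary does not say which correlation coefficient is meant. The sketch below assumes the Pearson product-moment coefficient, which is the most common choice, and uses invented figures.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical variables for five units.
staff = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.0, 4.0, 6.0, 8.0, 10.0]       # rises with staff
complaints = [10.0, 8.0, 6.0, 4.0, 2.0]  # falls as staff rises

r_positive = pearson(staff, sales)
r_negative = pearson(staff, complaints)
print(round(r_positive, 3), round(r_negative, 3))
```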
| Cost efficiency |
Cost efficiency (which is also known as economic efficiency) is
the ratio of the minimum cost to the actual (observed) cost. |
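The two ratios defined in this glossary, cost efficiency and allocative efficiency, can be illustrated with a short sketch. The function names and the figures are hypothetical, and the technical efficiency is assumed to come from a separate DEA run.

```python
def cost_efficiency(minimum_cost, actual_cost):
    """Ratio of the minimum cost to the actual (observed) cost."""
    if actual_cost <= 0:
        raise ValueError("actual_cost must be positive")
    return minimum_cost / actual_cost


def allocative_efficiency(cost_eff, technical_eff):
    """Cost efficiency divided by technical efficiency."""
    if technical_eff <= 0:
        raise ValueError("technical_eff must be positive")
    return cost_eff / technical_eff


# Hypothetical unit: it spends 120 where 90 would have been enough,
# and its technical efficiency is assumed to be 0.9.
ce = cost_efficiency(minimum_cost=90.0, actual_cost=120.0)
ae = allocative_efficiency(ce, technical_eff=0.9)
print(f"cost efficiency = {ce:.3f}, allocative efficiency = {ae:.3f}")
```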
| Cross efficiency matrix |
A tool used to help with the identification of efficient operating
practices. A unit with a high average efficiency, from a cross efficiency
matrix, offers a good comparator for inefficient units to work towards.
A cross efficiency matrix consists of rows and columns (i x j) each
equal to the number of units in the analysis. The efficiency of unit
j, is computed with the optimal weights for unit i. The higher the
values given in the column j, the more likely it is that the unit
j is an example of good (truly efficient) operating practices. |
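A cross efficiency matrix can be sketched as follows, assuming the optimal weights for each unit are already known from solving the primal model. The data (two inputs, one output) and the weights are small hand-made figures for illustration only.

```python
def cross_efficiency_matrix(inputs, outputs, input_weights, output_weights):
    """Entry [i][j] is the efficiency of unit j rated with unit i's weights."""
    n = len(inputs)
    matrix = []
    for i in range(n):
        row = []
        for j in range(n):
            weighted_out = sum(u * y for u, y in zip(output_weights[i], outputs[j]))
            weighted_in = sum(v * x for v, x in zip(input_weights[i], inputs[j]))
            row.append(weighted_out / weighted_in)
        matrix.append(row)
    return matrix


def column_averages(matrix):
    """Average score each unit receives across all units' weights."""
    n = len(matrix)
    return [sum(row[j] for row in matrix) / n for j in range(n)]


# Hypothetical units A, B, C: two inputs each, one output each.
inputs = [[2.0, 4.0], [4.0, 2.0], [4.0, 4.0]]
outputs = [[1.0], [1.0], [1.0]]
# Assumed optimal weights per unit (row i belongs to unit i).
input_weights = [[0.5, 0.0], [0.0, 0.5], [0.125, 0.125]]
output_weights = [[1.0], [1.0], [0.75]]

matrix = cross_efficiency_matrix(inputs, outputs, input_weights, output_weights)
averages = column_averages(matrix)
for row in matrix:
    print(row)
print(averages)
```

The diagonal holds each unit's own efficiency score. A unit whose column average is high performs well even when judged by the weights that other units prefer.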
| Data Envelopment Analysis. (DEA). |
Data envelopment analysis is a non-parametric technique, used for
performance measurement and benchmarking. It uses linear programming
to determine the relative efficiencies of a set of homogeneous (comparable)
units. It is a "process based" analysis, in other words, it can be
applied to any unit based enterprise, regardless of whether or not
a "profit" figure is involved in the evaluation. The use of DEA also
overcomes some of the problems with traditional performance measurement
methods, such as ratio analysis and regression analysis. (DEA is the
core analysis used by Frontier Analyst, to which Banxia Software has
added a variety of extra features, such as regression analysis, to
make an efficiency study easier and to provide a comprehensive efficiency
analysis tool). |
| Data set |
The data set is the group of units (DMU's) and the values of their
inputs and outputs to be included in the analysis. The data set is
usually presented in tabular form (often initially in a spreadsheet),
where the unit names constitute the rows and the input and output
variables constitute the columns. Zero values are not allowed in DEA
and where the value of an input or output is missing, that particular
unit may have to be omitted from the data set (unless a substitute
value can be agreed upon). |
| Decision making unit. (DMU). |
Decision making unit was the name used by Charnes et al (1978)
to describe the units being analyzed in DEA. The use of this term
is intended to redirect the emphasis of the analysis from profit making
businesses to decision making entities. In other words, the analysis
which is performed can be applied to any unit based enterprise and
need have nothing to do with profit. |
| Decreasing returns to scale. (DRS). |
Decreasing returns to scale are operating when an increase in a
unit's inputs results in a less than proportionate increase in its
outputs. |
| Dual model |
The dual model and the primal (CCR) model provide two ways of looking
at the same problem and the efficiency scores calculated are the same
with both. Mathematically, the dual model is much faster to solve
(although its formulation looks more complex). The difference between
the two is that for each unit the dual model (internally) tries to
create a hypothetical composite unit, from the existing units, that
will out-perform the unit being analyzed. If, within the dual model
this composite unit can be created, then the original unit is found
to be inefficient, otherwise the unit is efficient. |
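For orientation, one common way of writing the dual (envelopment) model for input minimization under constant returns to scale is sketched below. The symbols x and y for input and output values, the index 0 for the unit being analyzed, and the omission of the epsilon and slack terms are simplifying assumptions made here; Z is the intensity factor and the lambdas are the dual weights.

```latex
\begin{aligned}
\min_{Z,\;\lambda} \quad & Z \\
\text{subject to} \quad
  & \sum_{j=1}^{n} \lambda_j \, x_{ij} \le Z \, x_{i0}, \qquad i = 1, \dots, m \\
  & \sum_{j=1}^{n} \lambda_j \, y_{rj} \ge y_{r0}, \qquad r = 1, \dots, s \\
  & \lambda_j \ge 0, \qquad j = 1, \dots, n
\end{aligned}
```

The left-hand sides describe the composite unit. If a composite can be found that produces at least the outputs of unit 0 with Z < 1 of its inputs, unit 0 is inefficient. In the BCC model the convexity constraint, that the lambdas sum to 1, is added.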
| Dual weights (λ) |
The dual weights (λ) - so called because
they are calculated using the dual model and sometimes also called
dual multipliers - give an indication of the importance given to a
particular unit in determining the input/output mix of the composite
unit. In the primal model the weights are associated with the inputs
and outputs in the model. In the dual model the weights are associated
with the DMU's. |
| Efficient/ efficiency frontier |
The efficiency frontier is the frontier (envelope) representing
"best performance" and is made up of the units in the data set which
are most efficient in transforming their inputs into outputs. The
units that determine the frontier are those classified as being
100% efficient. Any unit not on the frontier has an efficiency rating
of less than 100%.
Empirical production function, empirical production envelope and
envelopment surface are all terms which are analogous to efficient
frontier.
|
| Efficiency score |
DEA results in each unit being allocated an efficiency score. This
score is between zero (or 0%) and 1 (100%). A unit with a score of
100% is relatively efficient. Any unit with a score of less than 100%
is relatively inefficient, e.g. a unit with a score of 60% is only
60% as efficient as the best performing units in the data set analyzed.
The efficiency score obtained by a unit will vary depending on the
other units and factors included in the analysis. Scores are relative,
not absolute - they are relative to the other units in the data set.
The web page http://www.banxia.com/faworks.html gives an explanation
of how DEA works. |
| Efficiency study |
The process of studying efficiency within an organisation. |
| Envelopment form |
This term is used to describe the formulation of a DEA model which
involves the concept of composite units. |
| Epsilon (ε) |
Epsilon is a very small positive constant (which at the time of
writing is taken as 1 x 10^-6 in Frontier Analyst). In theory it is
a non-Archimedean quantity, which means that it is smaller than any
positive real number. Epsilon is a theoretical-mathematical
device to allow us to drive slack variable values to zero, without
adding or subtracting any "real" amount to the objective function.
In practice this means that inputs and outputs are not "abused as
free commodities" (Miliotis 1992) and avoids a unit being wrongly
classified as efficient. |
| Environmental factor |
An environmental factor is neither an economic resource nor a product
but rather an attribute of the environment in which the units operate.
An environmental factor which adds resource can be included as an
input, e.g. an analysis of a chain of shops may include a factor which
measures the strength of competition a shop faces in its area. Environmental
factors may be measured directly or through the use of surrogate measures. |
| Facet |
Each of the segments which make up the efficient frontier is known
as a facet. Generally, where efficient units make a reference set,
they are located on the same facet. Facet and reference set refer
to the same concept. |
| Global leader |
A global leader will act as a model of good operating practice
for inefficient units. Oral and Yolalan (1990) define a global leader
as an efficient unit which appears most frequently in the reference
set for inefficient units. |
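Following the Oral and Yolalan definition quoted above, a global leader can be found by counting how often each efficient unit appears in the reference sets of the inefficient units. The sketch below assumes the reference sets are already available; the unit names are invented.

```python
from collections import Counter


def global_leaders(reference_sets):
    """Return the most frequently referenced unit(s) and all counts.

    reference_sets: dict of inefficient unit -> list of its reference units.
    """
    counts = Counter(ref for refs in reference_sets.values() for ref in refs)
    if not counts:
        return [], counts
    top = max(counts.values())
    leaders = sorted(name for name, count in counts.items() if count == top)
    return leaders, counts


# Hypothetical reference sets for three inefficient units.
reference_sets = {"D": ["A", "B"], "E": ["A"], "F": ["A", "C"]}

leaders, counts = global_leaders(reference_sets)
print(leaders, dict(counts))
```

Ties are returned together, since more than one efficient unit may appear equally often.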
| Homogeneous |
A DEA study requires a set of homogeneous units. Homogeneity refers
to the degree of similarity between units. The operational goals of
the units should be similar, as should their operational characteristics. |
| Increasing returns to scale |
Increasing returns to scale exist when an increase in a unit's
inputs yields a greater than proportionate increase in its outputs. |
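The three returns-to-scale definitions in this glossary (constant, decreasing, increasing) all compare the proportionate change in outputs with the proportionate change in inputs. A minimal sketch of that comparison, with a hypothetical function name and tolerance:

```python
def classify_returns_to_scale(input_change_pct, output_change_pct, tol=1e-9):
    """Classify returns to scale from percentage changes.

    input_change_pct:  percentage increase in inputs (must be positive).
    output_change_pct: resulting percentage change in outputs.
    """
    if input_change_pct <= 0:
        raise ValueError("input_change_pct must be a positive increase")
    if abs(output_change_pct - input_change_pct) <= tol:
        return "constant"
    if output_change_pct > input_change_pct:
        return "increasing"
    return "decreasing"


print(classify_returns_to_scale(10.0, 10.0))  # proportionate increase
print(classify_returns_to_scale(10.0, 15.0))  # more than proportionate
print(classify_returns_to_scale(10.0, 4.0))   # less than proportionate
```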
| Inefficient unit |
An inefficient unit is one which, when compared with
the actual performance achieved by other units in the analysis, should
be able to produce its current level of outputs with fewer inputs
or generate a higher level of outputs given the same inputs. |
| Inputs |
An input is any resource used by a unit to produce its outputs (products
or services). This can include resources which are not a product but
are an attribute of the environment in which the units operate. They
can be controlled or uncontrolled. |
| Input minimization |
Input minimization is the DEA mode adopted when the analysis tries
to minimize the amount of inputs used to produce the specified outputs.
(The opposite of input minimization is output maximization). |
| Input orientated |
Input orientated is a term used in conjunction with the BCC and
CCR ratio models, to indicate that an inefficient unit may be made
efficient by reducing the proportions of its inputs but keeping the
output proportions constant (see also input minimization and output
maximization). (Note: the CCR model will yield the same efficiency
score regardless of whether it is input or output orientated. This
is not the case with the BCC model). |
| Input/output mix |
The term "input/ output mix" refers to the relative proportions
of a unit's inputs and outputs. |
| Intensity factor. (Z). |
In the dual model the scalar, Z, is the intensity
factor. The intensity factor indicates the proportional reduction
in inputs (when using input minimization) or the increase in outputs
(if using output maximization) to achieve efficiency. |
| Local returns to scale |
Local returns to scale describes what happens to a unit's outputs
when the input levels are changed. |
| Most productive scale size. (MPSS). |
The most productive scale size of an efficient unit refers to the
point (on the efficient frontier) at which maximum average productivity
is achieved for a given input/ output mix. At MPSS constant returns
to scale are operating. After reaching MPSS, decreasing returns to
scale set in. |
| Multiplier form |
Associated with both the BCC and CCR models the multiplier form
is both a primal and a dual formulation. The multiplier form of DEA
model formulation involves virtual multipliers (see Ali and Seiford
1993). |
| Ordinal variable |
A special type of categorical variable where the factor takes on
a predefined set of values ranked in a specific order. |
| Outlier |
An outlier (sometimes in statistics referred to as an "obscene
outlier") is a unit whose input/output mix differs significantly from
the other units in the data set. Where an outlier is found to be efficient,
it may introduce bias into the results. |
| Output |
Outputs are the products (goods, services or other
outcome) which result from the processing and consumption of inputs
(resources). An output may be physical goods or services or a measure
of how effectively a unit has achieved its goals. |
| Output maximization |
Output maximization is the DEA mode adopted when the analysis tries
to maximize the outputs produced for a fixed amount of inputs. (The
opposite of output maximization is input minimization). |
| Output orientated |
Output orientated is a term used in conjunction with the BCC and
CCR ratio models, to indicate that an inefficient unit may be made
efficient by increasing the proportions of its outputs while keeping
the input proportions constant (see also input minimization and output
maximization). (Note: the CCR model will yield the same efficiency
score regardless of whether it is input or output orientated. This
is not the case with the BCC model). |
| Peer group |
Another name for a reference set. |
| Primal (CCR) model |
Some authors differ on which model should be referred to as the
primal (the first or original) model and which should be referred
to as the dual model. Some refer to the dual model as primal because
it illustrates better the principles of DEA. Throughout this glossary
and all other Frontier Analyst literature, the primal model is that
referred to by Charnes et al in their original publication (Charnes
et al 1978. See CCR for full reference). The primal model allows a
set of optimal weights to be calculated for each variable (input and
output) to maximize a unit's efficiency score. The weights are such
that were these weights applied to any other unit in the data set
the efficiency score would not exceed 1 (or 100%). |
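The primal (CCR ratio) model described above is usually written as follows for the unit under analysis, indexed 0 here. The symbols (x for inputs, y for outputs, u and v for the output and input weights, h for the score) are notation assumed for this sketch rather than taken from Frontier Analyst.

```latex
\begin{aligned}
\max_{u,\;v} \quad & h_0 = \frac{\sum_{r=1}^{s} u_r \, y_{r0}}{\sum_{i=1}^{m} v_i \, x_{i0}} \\
\text{subject to} \quad
  & \frac{\sum_{r=1}^{s} u_r \, y_{rj}}{\sum_{i=1}^{m} v_i \, x_{ij}} \le 1,
    \qquad j = 1, \dots, n \\
  & u_r \ge \varepsilon, \quad v_i \ge \varepsilon
\end{aligned}
```

The constraints express the condition stated in the text: with the weights chosen for unit 0, no unit in the data set may score more than 1 (100%).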
| Production function |
The production function describes the optimal relationship between
inputs and outputs with the aim of maximising output for the given
inputs. In DEA the equivalent of the production function is the efficiency
frontier. |
| Productive efficiency. (Efficiency). |
Productive efficiency (often just referred to as efficiency)
is a measure of a unit's ability to produce outputs from a given set
of inputs (Norman and Stoker 1991). The efficiency of a DMU is always
relative to the other units in the set being analysed, so the efficiency
score is always a relative measure. A unit's efficiency is related
to its radial distance from the efficient or efficiency frontier (see
radial measure). It is the ratio of the distance from the origin to
the inefficient unit, over the distance from the origin to the composite
unit on the efficient frontier. |
| Productivity |
In the case of a process with a single input and a single output,
productivity is the ratio of the unit's outputs to its inputs. DEA
does not measure productivity, it measures the efficiency of the production
process. Productivity is a function of production technology, the
efficiency of the production process and the production environment. |
| Radial measure |
Both the BCC and CCR ratio models use a radial or proportional
measure to determine a unit's efficiency score. A unit's efficiency
is defined by the ratio of the distance from the origin to the inefficient
unit, divided by the distance from the origin to the composite unit
on the efficient frontier. |
| Ratio models |
Both the BCC and CCR models are called ratio models because they
define efficiency as the ratio of weighted outputs divided by weighted
inputs. |
| Reference contribution |
Reference contribution indicates the degree to which a reference
unit contributes to the calculation of the efficiency score for a
unit. |
| Reference set |
The reference set of an inefficient unit is the set of efficient
units to which the inefficient unit has been most directly compared
when calculating its efficiency rating. It contains the efficient
units which have the most similar input/output orientation to the
inefficient unit and should therefore provide examples of good operating
practice for the inefficient unit to emulate. |
| Results |
Having conducted an analysis, the DEA model will produce, for each
unit, an efficiency score, virtual multipliers, intensity factors,
the dual weights and the slacks. From these are calculated the virtual
inputs and virtual outputs, the reference sets and improvement targets
for each unit. |
| Scale efficiency |
A unit is "scale efficient" when its
size of operation is optimal. If its size of operation is either reduced
or increased its efficiency will drop. A scale efficient unit is operating
at optimal returns to scale. Scale efficiency is calculated by dividing
aggregate efficiency (from the CCR model) by technical efficiency
(from the BCC model). |
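The calculation described above is a single division. The sketch below assumes the CCR and BCC scores have already been computed and are expressed as fractions between 0 and 1; the function name and scores are hypothetical.

```python
def scale_efficiency(ccr_score, bcc_score, tol=1e-9):
    """Aggregate (CCR) efficiency divided by technical (BCC) efficiency."""
    if not 0 < bcc_score <= 1:
        raise ValueError("bcc_score must be in (0, 1]")
    if ccr_score > bcc_score + tol:
        # The BCC score is at least equal to the CCR score.
        raise ValueError("ccr_score cannot exceed bcc_score")
    return ccr_score / bcc_score


# Hypothetical unit: 60% under the CCR model, 80% under the BCC model.
se = scale_efficiency(0.6, 0.8)
print(f"scale efficiency = {se:.0%}")
```

A scale efficiency of 100% means the two scores coincide, so none of the unit's inefficiency is attributed to its size of operation.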
| Slack(s) |
Slack represents the under production of output or
the over use of input. It represents the improvements needed to make
an inefficient unit become efficient. These improvements are in the
form of an increase/decrease in inputs or outputs. |
| Surrogate measures |
Where a measure is "intangible", in the sense that
no quantitative data exists for it, then surrogate measures can
be used. Surrogate measures are used to represent factors such as
environment factors, for example a "score" for the type of neighborhood
in which a unit operates, or the achievement of an organizational
goal (which does not have a statistically quantifiable outcome) and
so on. |
| Targets |
The values of the inputs and outputs which would result in an inefficient
unit becoming efficient. |
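One common way to derive targets under input minimization is to scale each input by the intensity factor Z and then remove any remaining input slack, while raising each output by its output slack. The glossary does not spell this out, so treat the formula and the figures below as an illustrative assumption rather than a description of Frontier Analyst's calculation.

```python
def input_oriented_targets(inputs, outputs, z, input_slacks, output_slacks):
    """Targets for an inefficient unit under input minimization.

    target input  = z * observed input  - input slack
    target output = observed output     + output slack
    """
    target_inputs = [z * x - s for x, s in zip(inputs, input_slacks)]
    target_outputs = [y + s for y, s in zip(outputs, output_slacks)]
    return target_inputs, target_outputs


# Hypothetical unit with Z = 0.8 and some slack on its second input.
target_in, target_out = input_oriented_targets(
    inputs=[20.0, 50.0],
    outputs=[100.0],
    z=0.8,
    input_slacks=[0.0, 5.0],
    output_slacks=[0.0],
)
print(target_in, target_out)
```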
| Technical efficiency |
A unit is said to be technically efficient if it maximizes output
per unit of input used. Technical efficiency is the efficiency of
the production or conversion process and is calculated independently
of prices and costs. Technical efficiency is calculated using the
BCC model. The impact of scale size is ignored as DMU's are compared
only with units of similar scale sizes. |
| Uncontrolled (exogenously fixed) inputs/ outputs |
An uncontrolled or uncontrollable variable (input
or output) is one over which the unit's management does not have control
and hence cannot alter its level of use or production. An example
of an uncontrolled input for a retail outlet would be the number of
competitors it had in its area. Uncontrollable variables are also
referred to as exogenously fixed and non-discretionary variables. |
| Unit |
A "unit" is simply a shorthand for "decision making unit" or "DMU".
Units may be outlets in a branch network of banks or shops. They may
be wards in hospitals or direct labor organizations in a public authority.
Data envelopment analysis can be applied to any unit based process. |
| Variable |
Variables are the input and output factors identified as being of
particular importance to the operation of the units under consideration.
For example, number of employees, patients treated (per hour), floor
space, sales, rent, number of transactions and so on. Classification
as inputs or outputs depends on the process being measured and the
goals against which units are being measured. What may be an input
when measured against one set of goals, may be an output when considered
under another. |
| Variable returns to scale |
If an increase in a unit's inputs does not produce a proportional
change in its outputs then the unit exhibits variable returns to scale.
This means that as the unit changes its scale of operations its efficiency
will either increase or decrease. |
| Virtual input/output |
Virtual inputs are calculated by multiplying the value of the input
with the corresponding optimal weight for the unit as given by the
solution to the primal model. Similarly for virtual outputs. Virtual
inputs/ outputs define the level of importance attached to each factor.
The sum of the virtual inputs for each unit always equals 1. The sum
of the virtual outputs is equal to the unit's efficiency score. |
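The relationship between weights, virtual inputs/outputs and the efficiency score can be checked with a small sketch. The weights below are invented and have been chosen so that the virtual inputs sum to 1, as they would in a solution to the primal model.

```python
def virtual_values(values, weights):
    """Multiply each input (or output) value by its weight."""
    return [value * weight for value, weight in zip(values, weights)]


# Hypothetical unit with two inputs and one output.
inputs = [2.0, 4.0]
input_weights = [0.25, 0.125]   # assumed optimal weights
outputs = [10.0]
output_weights = [0.08]         # assumed optimal weight

virtual_inputs = virtual_values(inputs, input_weights)
virtual_outputs = virtual_values(outputs, output_weights)

print("virtual inputs:", virtual_inputs, "sum:", sum(virtual_inputs))
print("virtual outputs:", virtual_outputs, "sum:", sum(virtual_outputs))
```

Here each input carries half of the total virtual input, and the sum of the virtual outputs (about 0.8) is the unit's efficiency score.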
| Virtual multipliers |
Another term used to describe weights. |
| Weight flexibility. (Weighting/ User defined weights). |
The CCR (primal) model does not place any restrictions on the weights
in the model, other than a minimum (lower bound) of epsilon. As a
result, it is possible for units to be rated as efficient through a
very uneven distribution of weights. This can mean that some or most
of the variables have been largely ignored. The Wong and Beasley
(1990) weighting method (implemented in Frontier Analyst Professional)
can be used to add weight restrictions to the model, if it is observed
that the kind of bias described above is occurring. |
| Weights |
Within DEA models weights are the 'unknowns' which are calculated
to determine the efficiency of the units. The efficiency score is
the weighted sum of outputs divided by the weighted sum of inputs
for each unit. The weights are calculated to solve the linear program,
in such a way that each unit is shown in the best possible light.
Weights indicate the importance attached to each factor (input/ output)
in the analysis. |
| Window analysis |
Window analysis is a tabular method which allows an
analysis of efficiency changes over time. The user chooses a set of
time periods and then calculates the efficiency of each unit for each
time period. A given unit is treated as a separate (new) unit in
each of the time periods. |
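The bookkeeping behind window analysis can be sketched as follows. The sketch assumes the usual approach of overlapping windows of a fixed width, with every unit/period pair inside a window treated as a separate unit. The function names, the "unit@period" labels and the data are hypothetical, and the DEA run itself is not shown.

```python
def build_windows(periods, width):
    """Overlapping windows of consecutive time periods."""
    if width < 1 or width > len(periods):
        raise ValueError("width must be between 1 and the number of periods")
    return [periods[i:i + width] for i in range(len(periods) - width + 1)]


def window_units(unit_names, window):
    """Treat each unit in each period of the window as a new unit."""
    return [f"{unit}@{period}" for unit in unit_names for period in window]


periods = ["Q1", "Q2", "Q3", "Q4"]
windows = build_windows(periods, width=3)
first_window_units = window_units(["A", "B"], windows[0])

print(windows)
print(first_window_units)
```

Each window's list of labels would then be analyzed as one data set, and the scores tabulated by unit and period.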