NOMENCLATURE
- BP
Basic Performance
- CA
Classification Accuracy
- CCSD
Correlation Coefficient and Standard Deviation
- CE
Conditional Entropy
- CRITIC
CRiteria Importance Through Inter-criteria Correlation
- CS
Cosine Similarity
- DM
Decision Maker
- DS
Decision System
- DSOM
Decision System Objective Method
- EEF
Effectiveness Evaluation of Fighter
- FBRW
Fuzzy Bayes Risk Weight assignment method
- GJC
Generalised Jaccard Coefficient
- GRA
Grey Relation Analysis
- IS
Information System
- ISOM
Information System Objective Method
- JC
Jaccard Coefficient
- LTCC
Longitudinal deviation and Transverse residual Correlation Coefficient
- MADA
Multiple Attribute Decision Analysis
- MP
Manoeuvre Performance
- NRS
Neighbourhood Rough Set
- PCC
Pearson Correlation Coefficient
- SD
Standard Deviation
- SMC
Simple Matching Coefficient
- WDBC
Wisconsin Diagnostic Breast Cancer
- a
an attribute
- A
attribute set
- c
a conditional attribute
- C
conditional attribute set
- d
a class of decision
- D
decision attribute
- I
information function
- LCC
Longitudinal Correlation Coefficient
- N
neighbourhood
- P
conditional probability
- r
region
- R
Bayes risk
- TCC
Transverse Correlation Coefficient
- U
universe
- V
value
- w
weight
- W
weight vector
- x
sample
Greek Symbol
- △
distance metric
- δ
neighbourhood threshold
- λ
loss function
1.0 INTRODUCTION
The Effectiveness Evaluation of Fighter (EEF) is one of the most common approaches to measuring the capability of a fighter to accomplish specific tasks, and it can be applied to many areas such as fighter design, combat simulation and the comparison of military strength(Reference Zhu, Zhu and Xiong1). There are several categories of methods for EEF, such as the Analytic Hierarchy Process (AHP)(Reference Ma, He, Ma and Xia2), Availability-Dependability-Capability (ADC)(Reference Liu and Li3), the synthesised index method(Reference Dong, Wang and Zhang4), Fuzzy Evaluation (FE)(Reference Zhu, Zhu and Xiong1) and Multiple Attribute Decision Analysis (MADA)(Reference Wang, Zhang and Xu5). Among these, MADA, known to be simple and practical, can obtain the evaluation results by relying only on the characteristics of the data in the index system. In the MADA method, the decision result for an object is obtained as the weighted sum of its attribute values, which evaluates the comprehensive performance of the object.
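The weighted-sum step mentioned above can be illustrated with a minimal sketch. The function name, the decision matrix and the weights are hypothetical, and the attribute values are assumed to be already normalised to [0, 1] with larger values being better.

```python
def weighted_sum_scores(decision_matrix, weights):
    """Score each alternative as the weighted sum of its attribute values."""
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights are expected to sum to 1")
    return [sum(w * v for w, v in zip(weights, row)) for row in decision_matrix]


# Hypothetical data: three alternatives (rows), three attributes (columns).
matrix = [
    [0.9, 0.4, 0.7],
    [0.6, 0.8, 0.5],
    [0.3, 0.9, 0.9],
]
weights = [0.5, 0.3, 0.2]

scores = weighted_sum_scores(matrix, weights)
# Index of the alternative with the highest comprehensive score.
best = max(range(len(scores)), key=scores.__getitem__)
```

The ranking depends entirely on the weight vector, which is why the assignment of attribute weights is the central question of the paper.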
Attribute weight assignment plays a significant role in MADA. The methods can be generally divided into three categories, i.e. subjective methods(Reference Forman and Gass6,Reference Hwang and Lin7), objective methods(Reference Deng, Yeh and Willis Robert9,Reference Wang and Luo13,Reference Diakoulaki, Mavrotas and Papayannakis14) and hybrid methods(Reference Tahib, Yusoff, Abdullah and Wahab15-Reference Dong, Xiao, Zhang and Wang18), according to the extent of their dependence on the preferences or subjective judgements of Decision Makers (DMs)(Reference Yang, Yang, Xu and Khoveyni8). In practical applications, ideal weights are usually hard to obtain by the subjective or hybrid methods when experts in the related field are lacking or when the DMs reach no unanimous conclusion(Reference Chin, Fu and Wang19,Reference Fu and Xu20). Fortunately, the objective weight methods can effectively solve this problem, because attribute weights are generated from the data rather than from the DMs’ preferences.
The data systems used in MADA can be broadly divided into two categories, namely the Decision System (DS) and the Information System (IS): a DS is a set of data consisting of conditional attributes and decision attributes, whereas an IS does not include decision attributes, i.e. labels. With respect to DSs, although they contain decision attributes, there are still many issues in MADA, such as effectiveness evaluation(Reference Sahoo, Sahoo, Dhar and Pradhan21), classification(Reference Ishibuchi and Yamamoto22) and fault diagnosis(Reference Suo, Zhu, Zhou, An and Li23). According to the data system to which they are applied, the objective methods fall into two groups, the Information System Objective Method (ISOM) and the Decision System Objective Method (DSOM). Based on these two categories, the objective methods are introduced in more detail below.
Most objective weight assignment methods are aimed at IS. Among them, the Entropy method(Reference Yang, Yang, Xu and Khoveyni8,Reference Deng, Yeh and Willis Robert9) is the most popular ISOM, and a number of approaches have been developed from it to obtain more satisfactory weight assignment results(Reference Yang, Yang, Xu and Khoveyni8,Reference Valkenhoef and Tervonen10,Reference He, Guo, Jin and Ren11). To mention a few, Valkenhoef and Tervonen(Reference Valkenhoef and Tervonen10) discussed the entropy-optimal weight constraint elicitation problem with additive multi-attribute utility models. He et al.(Reference He, Guo, Jin and Ren11) proposed a linguistic entropy weight method to determine the attribute weights in linguistic MADA. Yang et al.(Reference Yang, Yang, Xu and Khoveyni8) designed a three-stage hybrid weight assignment approach based on entropy theory. In this category, another widely used method is Principal Components Analysis (PCA)(Reference Jolliffe12). Moreover, Diakoulaki et al.(Reference Diakoulaki, Mavrotas and Papayannakis14) proposed a weight determination method based on the quantification of two fundamental notions of MADA, the contrast intensity and the conflicting character of the evaluation criteria, which is named CRiteria Importance Through Inter-criteria Correlation (CRITIC). Deng et al.(Reference Deng, Yeh and Willis Robert9) employed the Standard Deviation (SD) to obtain the weights of attributes. In order to consider the relationships between attributes, Wang and Luo(Reference Wang and Luo13) proposed a Correlation Coefficient and Standard Deviation (CCSD) integrated method for determining attribute weights. To the best of the authors’ knowledge, none of the above-mentioned methods takes into account the contribution of the decision attribute to the determination of conditional attribute weights when applied to DS.
With respect to DS, there is usually a single decision attribute (see footnote 1) that can be regarded as a generalisation of the overall system and an abstract of all the conditional attributes. Each conditional attribute provides a particular contribution to its system and an individual degree of support to the abstract of the decision attribute, which can be depicted by the weight of the conditional attribute. Therefore, the weight determination of the conditional attribute cannot ignore the role of the decision attribute in DS. Approaches such as the Conditional Entropy (CE)(Reference Liang, Chin, Dang and Yam24) approach, the Grey Relation Analysis (GRA)(Reference Deng25) approach and the Rough Set (RS)(Reference Suo, An, Zhou and Li26) approach can be considered as alternatives for the weight assignment of conditional attributes in DS, because they take into account the coupling relationships between the conditional attributes and the decision attribute.
In fact, with regard to MADA, the final decision produced by any decision-making unit will be accompanied by some risks. These risks are usually derived from the difference in the distribution of data between the conditional attributes and the decision attribute. Consequently, each of the conditional attributes will generate a unique risk for the final decision, for which the weight of the attribute can serve as a metric. In addition, the fuzzy membership of a sample induced by conditional attributes to a decision attribute can be considered as a main relationship between the two types of attributes. However, the aforementioned objective methods, whether ISOMs or DSOMs, have neither taken the decision risk as a main factor in determining the weights of attributes nor taken the above fuzzy membership into account. Furthermore, weight assignment in a multi-layer attribute set usually needs the help of experts (DMs) or is achieved through some complex combination methods(Reference Tahib, Yusoff, Abdullah and Wahab15), which has greatly limited the application of weight assignment in multi-layer index systems. These inadequacies of the present research motivate this work.
To handle the aforementioned issues and overcome the deficiencies of the existing methods, we propose a simple and effective objective attribute weight assignment method based on a Fuzzy Bayes Risk (named FBRW), which is not only suitable for ISs and DSs, but can also be applied to single-layer and multi-layer index systems. Bayes risk uses the calculation of probability to estimate the risk of an event, and takes full account of the causality and dependency between events, i.e. the relationship between the conditional attributes and the decision attribute in DS. Therefore, Bayes risk is very suitable for estimating the weights of conditional attributes. The loss function in Bayes risk, however, is usually determined by experts or through a large number of statistical tests(Reference Kumar and Byrne35,Reference Gonzlez-Rubio and Casacuberta36), which greatly limits the practical application and extension of Bayes risk theory. To cope with this drawback, a loss function model based on a Gaussian kernel is proposed, in which the loss values of samples are obtained from the distribution characteristics of the data. Furthermore, considering the fuzziness of the data system, the fuzzy similarity of each sample and the fuzzy membership between conditional attributes and the decision attribute are employed in our method. On the other hand, weight evaluation, a significant part of weight assignment, has not received enough attention in the literature. Hence, we propose a correlation coefficient named the Longitudinal deviation and Transverse residual Correlation Coefficient (LTCC), which considers two directions, i.e. the longitudinal direction and the transverse direction, to measure the similarity between the assigned weights and the reference ones. Subsequently, a number of comparison experiments are carried out to illustrate the superiority of the proposed method.
Finally, we demonstrate and verify the applicability of the proposed method through the effectiveness evaluation of fighter. The main contributions of this work are as follows:
(1) A simple and effective objective attribute weight assignment method based on fuzzy Bayes risk is proposed.
(2) A Gaussian kernel loss function model is proposed, which could promote the application and extension of Bayes risk theory.
(3) A longitudinal deviation and transverse residual correlation coefficient model is proposed for weight evaluation.
(4) The demonstration and analyses of the fighter effectiveness evaluation have guiding significance for other similar engineering applications.
The remainder of this paper is organised as follows. Section 2 introduces the preliminaries for this work. The basic theories and analyses of the proposed method are presented in Section 3. The weight evaluation model is depicted in Section 4. The results and analyses of numerical experiments are given in Section 5, and the effectiveness evaluation of fighter is demonstrated in Section 6. Then, some discussions are brought in Section 7. Finally, conclusions and future work are described in Section 8.
2.0 PRELIMINARIES
This work takes the decision system as the research object, and employs the Bayes risk and neighbourhood relationship to realise the assignment of attribute weight. Thus, in this section, the aforementioned theories and concepts are introduced, which will pave the way for the further development of the following sections.
Definition 1. (Decision system)(Reference Yao28) A decision system is a 4-tuple $DS = (U, \{A \mid A = C \cup D\}, \{V_a \mid a \in A\}, \{I_a \mid a \in A\})$, where $U = \{x_1, x_2, \cdots, x_m\}$ is a finite set of objects called the universe, $A$ is the attribute set, $C$ is the set of conditional attributes, $D$ is the decision attribute, $C \cap D = \emptyset$, $D \neq \emptyset$, $V_a$ is the set of values of each $a \in A$, and $I_a$ is an information function for each $a \in A$.
Because of the four-tuple form (i.e. $U$, $A$, $V$, $I$) and the three basic elements (i.e. $U$, $C$, $D$) of the decision system in Definition 1, a decision system is often denoted as $DS = (U, A, V, I)$ or $DS = (U, C \cup D)$ for short. Note that the above definition considers only one decision attribute; such a definition has been widely used in practical applications, and systems with multiple decision attributes can also be transformed into ones with a single decision attribute. In particular, a decision system is called an information system $IS = (U, C)$ if its decision attribute forms an empty set(Reference Zhu, Zhu and Fan29).
Definition 2. (Neighbourhood)(Reference Hu, Yu, Liu and Wu27) Given an arbitrary sample $x_i \in U$ and a conditional attribute subset $B \subseteq C$, the neighbourhood $N_B(x_i)$ of $x_i$ in $B$ is defined as:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_eqn1.gif?pub-status=live)
where △ is a metric, δ is a threshold.
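Equation (1) is only available as an image here, so the following is a minimal sketch under the assumption that it takes the usual neighbourhood rough set form $N_B(x_i) = \{x_j \in U \mid \triangle_B(x_i, x_j) \leqslant \delta\}$. The Euclidean metric, the function names and the sample values are illustrative choices, not taken from the paper.

```python
import math


def euclidean(u, v, attrs):
    """Distance between two samples restricted to the attribute indices in attrs."""
    return math.sqrt(sum((u[a] - v[a]) ** 2 for a in attrs))


def neighbourhood(samples, i, attrs, delta, metric=euclidean):
    """Indices of all samples whose distance to samples[i] is at most delta."""
    return [j for j, x in enumerate(samples)
            if metric(samples[i], x, attrs) <= delta]


# Hypothetical universe of four samples described by two normalised attributes.
samples = [[0.10, 0.20], [0.15, 0.25], [0.80, 0.90], [0.12, 0.70]]

n0_both = neighbourhood(samples, 0, attrs=[0, 1], delta=0.2)   # B = {a0, a1}
n0_first = neighbourhood(samples, 0, attrs=[0], delta=0.2)     # B = {a0}
```

Restricting B to a single attribute enlarges the neighbourhood of the first sample in this example, because the fourth sample is close on the first attribute only.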
In order to consider the fuzziness between each sample, a fuzzy similarity relation(Reference Wang, Shao, He, Qian and Qi33) is introduced as follows:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_eqn2.gif?pub-status=live)
where $a \in C$, $\delta$ is a neighbourhood threshold and $0 < \delta \leqslant 1$. Obviously, the following properties hold:
(1) $\triangle_a(x_i, x_j) = \triangle_a(x_j, x_i)$;
(2) $\triangle_a(x_i, x_i) = 1$;
(3) $0 \leqslant \triangle_a(x_i, x_j) \leqslant 1$.
The samples that satisfy the first condition in Equation (2) with respect to $x_i$ are denoted by $N_a(x_i)$ and called the fuzzy neighbourhood of $x_i$.
For a subset $B \subseteq C$, $a \in B$, the fuzzy neighbourhood relation $\triangle_B(x_i, x_j)$ between $x_i$ and $x_j$ is defined as follows:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_eqn3.gif?pub-status=live)
where ∩ is the fuzzy conjunction.
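Equations (2) and (3) are only available as images here. The sketch below therefore assumes one common piecewise form that satisfies the stated properties: the similarity is $1 - |x_i - x_j|$ on attribute $a$ when that difference is at most $\delta$, and 0 otherwise, with attribute values normalised to [0, 1]. The fuzzy conjunction is assumed to be the minimum. Both choices and all names are assumptions for illustration.

```python
def fuzzy_similarity(xi, xj, a, delta):
    """Assumed single-attribute fuzzy similarity on normalised values."""
    d = abs(xi[a] - xj[a])
    # Inside the threshold: similarity decreases linearly with distance.
    # Outside the threshold: the samples are treated as unrelated.
    return 1.0 - d if d <= delta else 0.0


def fuzzy_similarity_subset(xi, xj, attrs, delta):
    """Assumed fuzzy conjunction (minimum) over an attribute subset B."""
    return min(fuzzy_similarity(xi, xj, a, delta) for a in attrs)


# Hypothetical samples with two normalised attributes.
x = [0.10, 0.20]
y = [0.15, 0.50]

s_first = fuzzy_similarity(x, y, 0, delta=0.2)                 # close on attribute 0
s_second = fuzzy_similarity(x, y, 1, delta=0.2)                # too far on attribute 1
s_both = fuzzy_similarity_subset(x, y, [0, 1], delta=0.2)      # limited by the worst attribute
```

Under this assumption a sample is always in its own fuzzy neighbourhood with similarity 1, which matches property (2).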
Definition 3. (Bayes risk)(Reference Duda, Hart and Stork30) Given a domain of objects $X = \{x_1, x_2, \cdots, x_m\}$ and a set of classes $Y = \{y_1, y_2, \cdots, y_n\}$, for a classification function $C : X \rightarrow Y$ that maps each object to one class, the risk of classifying $x_i$ ($x_i \in X$) into $y_p$ ($y_p \in Y$) is defined as follows:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_eqn4.gif?pub-status=live)
where $\lambda_{pq}$ is a loss function that measures the error of classifying object $x_i$ into class $y_p$ when the possible class is $y_q$, and $P(y_q|x_i)$ is the probability of object $x_i$ belonging to class $y_q$.
It is worth noting that different loss functions will yield different decision risks. In practical applications, the loss function is usually difficult to specify, which has been the key bottleneck in the application and extension of Bayes risk theory.
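Definition 3 can be illustrated with a minimal sketch, assuming Equation (4) is the standard conditional risk $R(y_p|x_i) = \sum_q \lambda_{pq} P(y_q|x_i)$. The 0-1 loss matrix and the posterior probabilities are hypothetical values.

```python
def conditional_risk(loss, posterior, p):
    """Expected loss of deciding class p: sum over q of loss[p][q] * P(y_q | x)."""
    return sum(loss[p][q] * posterior[q] for q in range(len(posterior)))


# Hypothetical 0-1 loss for three classes: no loss on the diagonal.
zero_one_loss = [
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 1.0],
    [1.0, 1.0, 0.0],
]
# Hypothetical posterior probabilities P(y_q | x) for one object.
posterior = [0.7, 0.2, 0.1]

risks = [conditional_risk(zero_one_loss, posterior, p) for p in range(3)]
# The Bayes decision is the class with the smallest conditional risk.
bayes_decision = min(range(3), key=risks.__getitem__)
```

With a 0-1 loss the risk of each class is simply one minus its posterior probability; a different loss matrix would change the risks, which is the dependence on the loss function noted above.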
3.0 WEIGHT ASSIGNMENT BASED ON FUZZY BAYES RISK
Through the previous analyses in Sections 1 and 2, we can know that the decision risk can be considered as an important factor in the attribute weight determination in DS. Therefore, in this section, the single-layer dataset is taken as an example to analyse the presented form of the above-mentioned risk in DS. Subsequently, the FBRW method is designed. Finally, the FBRW method is extended for the weight assignment in a multi-layer dataset.
3.1 FBRW for single-layer DS
Generally, data-driven attribute weights are determined based on a single-layer dataset(Reference Deng, Yeh and Willis Robert9,Reference Wang and Luo13). Considering a single conditional attribute $c$ ($c \in C$) and $D = \{d_1, d_2, \cdots, d_K\}$ in DS, the distribution of the sample set $X = \{x_1, x_2, \cdots, x_m\}$ in $c$ with respect to $D$ can be depicted as in Fig. 1, where $d_i, d_j, d_k \in D$.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_fig1g.jpeg?pub-status=live)
Figure 1. The distribution of X in c with respect to D.
From Fig. 1, we can see that decision making in region $r_1$ is bound to generate risks because of the overlaps of three probability distributions. Region $r_2$, in particular, will produce a riskier decision than $r_3$ and $r_4$ because there are three distributions in it.
With the help of a neighbourhood relationship, we transform the above distributions into a discrete space, where the basic cells are neighbourhood sets instead of samples. Taking the set of neighbourhoods with regard to decision class $d_i$ as an example, the distribution of the neighbourhood sets $N$ in $c$ is depicted in Fig. 2.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_fig2g.jpeg?pub-status=live)
Figure 2. The distribution of N in c with respect to d i.
In Fig. 2, $N_k$ ($N_k \in N$, $k = 1, 2, \cdots, 8$) is the neighbourhood of sample $x_k$, the red regions denote the samples in neighbourhood $N_k$ that are classified into $d_i$ in terms of a given metric, and the grey regions denote those classified into classes other than $d_i$.
According to the above analyses and Definition 3, the risk produced by a conditional attribute can be regarded as the error that measures the difference between the distribution of that conditional attribute and that of the decision attribute. From this we derive the principle of attribute weight assignment in DS: an attribute whose Bayes risk with respect to the decision attribute is smaller should be assigned a greater weight. The Bayes risk with respect to DS is therefore defined as follows.
Definition 4. (Bayes risk in DS) Given a decision system $DS = (U, C \cup D, V, I)$, $U = \{x_1, x_2, \cdots, x_m\}$, $D = \{d_1, d_2, \cdots, d_K\}$, an arbitrary sample $x_i \in U$ may be assigned to any decision class of $D$ with respect to an attribute $c \in C$ by some metric, but it belongs to a certain class $d_k \in D$ in terms of the information function $I$. Therefore, the Bayes risk of $x_i$ belonging to $d_k$ with respect to $c$ is defined as:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_eqn5.gif?pub-status=live)
where $\lambda_{kj}(c, x_i)$ is a loss function that measures the loss relative to its own class $d_k$ when $x_i$ is classified into the possible class $d_j$, and $P(d_j|x_i)$ is the probability of $x_i$ belonging to $d_j$.
The commonly used loss function is the 0–1 type, but it cannot effectively evaluate the real loss of decision making. To improve its effectiveness, the loss function is usually determined by experts or through a large number of statistical tests(Reference Kumar and Byrne35,Reference Gonzlez-Rubio and Casacuberta36), which has been a stumbling block to the application and extension of this theory. Therefore, we propose a loss function based on a Gaussian kernel, in which the loss values of samples are determined by the distribution characteristics of the data.
Definition 5. (Gaussian kernel loss function) Given a decision system $DS = (U, C \cup D)$, $c \in C$, $D = \{d_1, d_2, \cdots, d_K\}$, for an arbitrary sample $x_i$ in $U$, its designated decision class is $d_k$ and its possible class produced by a given metric is $d_j$, with $d_j, d_k \in D$. Then, the Gaussian kernel loss function of $x_i$ relative to $c$ is defined as:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_eqn6.gif?pub-status=live)
where $\mu_k$ is the expectation of the sample set belonging to class $d_k$ with respect to $c$, $\sigma_k$ is the corresponding standard deviation, and $k, j \in \{1, 2, \cdots, K\}$. Usually, we write the loss function as $\lambda_{kj}$ for short.
Remark 1. There are three aspects about the Gaussian kernel loss function: (a) the loss of the sample is 0 if it is divided into its own decision class, i.e. k = j; (b) the smaller the distance between the sample and the expectation, the greater the loss will be; (c) the loss is 1 if the standard deviation is 0, which means that all the data in this class are equal to each other, i.e. every datum is the expectation.
Through the above definition and remark, we can see that the Gaussian kernel loss function model could accord with the definition of loss function in Definition 4.
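The three aspects listed in Remark 1 can be checked with a minimal sketch. It assumes the form of the loss given in the proof of Theorem 1, $\exp(-(x_i - \mu_k)^2 / (2\sigma_k^2))$ for $k \neq j$, together with the two special cases stated in Remark 1. The function name and the numerical values are hypothetical.

```python
import math


def gaussian_kernel_loss(x, mu_k, sigma_k, k, j):
    """Loss of classifying a sample of class k into class j on one attribute."""
    if k == j:
        return 0.0          # Remark 1(a): no loss for the sample's own class
    if sigma_k == 0:
        return 1.0          # Remark 1(c): every datum equals the expectation
    return math.exp(-((x - mu_k) ** 2) / (2.0 * sigma_k ** 2))


# Hypothetical class statistics on one attribute: mean 0.5, standard deviation 0.1.
at_mean = gaussian_kernel_loss(0.5, 0.5, 0.1, k=0, j=1)
one_sigma = gaussian_kernel_loss(0.6, 0.5, 0.1, k=0, j=1)
far_away = gaussian_kernel_loss(0.9, 0.5, 0.1, k=0, j=1)
```

Misclassifying a sample that sits at the centre of its own class costs the most, while misclassifying an outlying sample costs little, which is the behaviour described in Remark 1(b).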
Definition 6. (Probability) Given a decision system $DS = (U, C \cup D)$, for an arbitrary sample $x_i \in U$, its neighbourhood is $N(x_i) = \{x_1, x_2, \cdots, x_m\}$, and the corresponding decision set is $N(d) = \{d_1, d_2, \cdots, d_p\}$, $N(d) \subseteq D$. Thus, the probability of classifying $x_i$ into $d_j$ ($d_j \in N(d)$) is denoted as:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_eqn7.gif?pub-status=live)
where $x_j$ is a sample belonging to decision class $d_j$ in terms of the information function in this DS and $\triangle$ is the fuzzy similarity relation between $x_i$ and $x_k$. Because it is built on the fuzzy similarity relation, this probability can also be called a fuzzy membership, and the Bayes risk in Definition 4 can be named the Fuzzy Bayes Risk (FBR), which is more succinct than our previous result(Reference Suo, Zhu, Zhang, An and Li37).
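Equation (7) is only available as an image here. The sketch below assumes that the probability is the share of fuzzy similarity contributed by the neighbours labelled $d_j$, i.e. the sum of similarities to neighbours in class $d_j$ divided by the sum of similarities to all neighbours. This form, the function name and the values are assumptions for illustration.

```python
def class_membership(similarities, labels, target):
    """Assumed fuzzy membership of a sample in class `target`.

    similarities[k] is the fuzzy similarity between the sample and sample k
    (0 for samples outside its fuzzy neighbourhood); labels[k] is the
    decision class of sample k.
    """
    total = sum(similarities)
    if total == 0:
        raise ValueError("the fuzzy neighbourhood is empty")
    in_class = sum(s for s, d in zip(similarities, labels) if d == target)
    return in_class / total


# Hypothetical similarities of one sample to four samples (itself first).
similarities = [1.0, 0.9, 0.0, 0.6]
labels = ["d1", "d1", "d2", "d2"]

p_d1 = class_membership(similarities, labels, "d1")
p_d2 = class_membership(similarities, labels, "d2")
```

Because a sample has similarity 1 to itself, its membership in its own class is strictly positive under this assumption, which agrees with the bound on the probability used in the proof of Theorem 2.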
Theorem 1. The Bayes risk (Definition 4) defined as $R_c(d_k|x_i) = \sum_{j=1}^{K} \lambda_{kj} P(d_j|x_i)$ is equivalent to $R_c(d_k|x_i) = \lambda_{\sim k}^{k}\,(1 - P(d_k|x_i))$, where $d_{\sim k}$ denotes the decision classes other than $d_k$.
Proof. According to Equation (6), the loss function can be rewritten as $\lambda _{\sim k}^k = \exp \left(- \frac{(x_i - \mu _k)^2}{2\sigma _k^2}\right)$ if $k \neq j$, and $\lambda_{kk} = 0$ if $k = j$. Therefore, the risk function can be written as:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_eqnU1.gif?pub-status=live)
Thus, $R_c(d_k|x_i) = \sum_{j=1}^{K} \lambda_{kj} P(d_j|x_i)$ is equivalent to $R_c(d_k|x_i) = \lambda_{\sim k}^{k}\,(1 - P(d_k|x_i))$.
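The displayed derivation in the proof is only available as an image here. A step-by-step reconstruction consistent with the statement of Theorem 1 is sketched below; it assumes that the probabilities over the decision classes sum to 1 and uses the fact that, for $j \neq k$, the loss does not depend on $j$.

```latex
\begin{aligned}
R_c(d_k|x_i) &= \sum_{j=1}^{K} \lambda_{kj} P(d_j|x_i) \\
             &= \lambda_{kk} P(d_k|x_i) + \sum_{j \neq k} \lambda_{kj} P(d_j|x_i) \\
             &= 0 + \lambda_{\sim k}^{k} \sum_{j \neq k} P(d_j|x_i) \\
             &= \lambda_{\sim k}^{k} \left(1 - P(d_k|x_i)\right).
\end{aligned}
```

The sum over $K$ classes thus collapses to a single product, so only the membership of $x_i$ in its own class needs to be computed.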
The above Theorem 1 greatly reduces the computational complexity of the Bayes risk, which will be helpful for promoting the proposed method.
Remark 2. Through the above theorem and its corresponding proof, the loss function can be rewritten as $\lambda _{\sim k}^k = \exp \left(- \frac{(x_i - \mu _k)^2}{2\sigma _k^2}\right)$.
Theorem 2. The Bayes risk (Definition 4) satisfies $0 \leqslant R_c(d_k|x_i) < 1$.
Proof. According to Theorem 1, the Bayes risk can be written as $R_c(d_k|x_i) = \lambda_{\sim k}^{k}\,(1 - P(d_k|x_i))$, with $\lambda _{\sim k}^k = \exp \left(- \frac{(x_i - \mu _k)^2}{2\sigma _k^2} \right)$. The loss function satisfies $0 < \lambda_{\sim k}^{k} \leqslant 1$. On the other hand, the probability satisfies $0 < P(d_k|x_i) \leqslant 1$ according to Definition 6, so that $0 \leqslant 1 - P(d_k|x_i) < 1$. The product of the two factors therefore lies in $[0, 1)$, and the theorem holds.
According to the aforementioned definitions, we can derive the definition of attribute weight based on the proposed fuzzy Bayes risk as follows.
Definition 7. (Bayes risk weight) Given a decision system $DS = (U, C \cup D)$, $U = \{x_1, x_2, \cdots, x_m\}$, for an arbitrary conditional attribute $c \in C$, its weight based on fuzzy Bayes risk is denoted as:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_eqn8.gif?pub-status=live)
where $\overline{R}_c = \frac{1}{m}\sum _{i=1}^m R_c(d_k|x_i)$, $x_i \in U$, $d_k \in D$.
It is easy to see that $0 \leqslant w_c \leqslant 1$ holds according to Theorem 2. Thus, the weight vector of DS is $\overline{W} = (\overline{w}_1, \overline{w}_2, \cdots , \overline{w}_n)$, where $\overline{w}_c = w_c / \sum _{j=1}^n w_j$ and $n$ is the number of conditional attributes.
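Equation (8) is only available as an image here. The sketch below assumes the simplest form consistent with the stated principle and bounds, namely $w_c = 1 - \overline{R}_c$, followed by the normalisation given in the text. The assumed form of $w_c$, the function names and the risk values are illustrative.

```python
def mean_risk(risks):
    """Average fuzzy Bayes risk of one attribute over all m samples."""
    return sum(risks) / len(risks)


def fbr_weights(risks_by_attribute):
    """Normalised weights from per-sample risks, one list of risks per attribute."""
    # ASSUMPTION: raw weight is one minus the mean risk, so that a smaller
    # risk yields a greater weight, as the assignment principle requires.
    raw = [1.0 - mean_risk(r) for r in risks_by_attribute]
    total = sum(raw)   # strictly positive because every risk is below 1 (Theorem 2)
    return [w / total for w in raw]


# Hypothetical per-sample risks for three conditional attributes, two samples each.
risks_by_attribute = [
    [0.1, 0.3],
    [0.4, 0.6],
    [0.0, 0.2],
]
weights = fbr_weights(risks_by_attribute)
```

In this example the third attribute has the smallest mean risk and therefore receives the largest normalised weight.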
3.2 FBRW for multi-layer DS
In practical applications, multi-layer index systems are frequently used in MADA. Generally, the weight determination of a multi-layer DS depends on some subjective approaches. Nevertheless, the method proposed in this paper can easily obtain the weights of the conditional attributes of each layer by using a neighbourhood relation.
Remark 3. The main difference in weight assignment between a single-layer attribute set and a multi-layer attribute set is the metric of the fuzzy neighbourhood. Equation (2) is employed in the single-layer case, whereas Equation (3) is employed in the multi-layer case.
Remark 4. The other differences of weight determination between single-layer attribute set and multi-layer attribute set are that (a) the single attribute c in the aforementioned definitions is replaced by a subset B (B⊆C); and (b) the normalised risk of an attribute set B should be rewritten as $\overline{R}_B = \frac{1}{m\cdot n}\sum _{k=1}^n \sum _{i=1}^m R_B(d_k|x_i)$, where n is the number of conditional attributes in B.
In addition, despite the above-mentioned differences, all the theorems still hold.
3.3 FBRW algorithm
Based on the preceding theories and definitions, the algorithm of FBRW can be designed as Algorithm 1.
Algorithm 1 FBRW algorithm
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_tab1.gif?pub-status=live)
4.0 WEIGHT EVALUATION
To the best of our knowledge, there are no ideal methods to evaluate the weight results. The existing methods mostly measure the quality of a method by the consistency between the assigned weights and the actual situation. In fact, if we can obtain reliable importance values of attributes by some approach, such as a large number of expert surveys or the classification accuracy (CA) in DS, we can employ a similarity degree or correlation coefficient to evaluate the rationality of the assigned weights. In statistics, the commonly used similarity measures are the Simple Matching Coefficient (SMC), Jaccard Coefficient (JC), Cosine Similarity (CS), Generalised Jaccard Coefficient (GJC) and Pearson Correlation Coefficient (PCC)(Reference Tan, Steinbach and Kumar34). Of these, SMC and JC are not suitable for continuous data, CS and GJC employ the vector dot product, and PCC considers the covariance and standard deviation of the data. None of these methods takes into account the fluctuation of data, which includes two aspects, i.e. the longitudinal direction and the transverse direction. Therefore, we propose a correlation coefficient that considers both the longitudinal deviation and the transverse residual.
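For orientation, the three existing measures for continuous data named above can be sketched in their textbook forms, with GJC taken as the extended Jaccard (Tanimoto) coefficient. The function names and the test vectors are hypothetical, and this is not the implementation used in the experiments.

```python
import math


def cosine_similarity(x, y):
    """Dot product divided by the product of the vector norms."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y)))


def generalised_jaccard(x, y):
    """Extended Jaccard (Tanimoto) coefficient for continuous vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    return dot / (sum(a * a for a in x) + sum(b * b for b in y) - dot)


def pearson(x, y):
    """Pearson correlation; returns None when either vector has zero deviation."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        return None   # the coefficient is undefined for constant data
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (sx * sy)


reference = [1.0, 2.0, 3.0, 4.0]
scaled = [2.0, 4.0, 6.0, 8.0]        # same direction, larger magnitude
shifted = [11.0, 12.0, 13.0, 14.0]   # same shape, translated upwards
constant = [3.0, 3.0, 3.0, 3.0]      # no variation at all
```

On these vectors, CS ignores the scaling, GJC penalises it, PCC ignores both scaling and translation, and PCC is undefined for the constant vector.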
Definition 8. (Longitudinal correlation coefficient) Given two vectors $X = \{x_1, x_2, \cdots, x_m\}$ and $Y = \{y_1, y_2, \cdots, y_m\}$, the longitudinal similarity degree between $X$ and $Y$ is defined as follows:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_eqn9.gif?pub-status=live)
where std(·) is the operation of standard deviation, $\bigsqcup$ denotes the combination of the elements in it, and | · | represents the absolute value of the elements in it.
Definition 9. (Transverse correlation coefficient) Given two vectors $X = \{x_1, x_2, \cdots, x_m\}$ and $Y = \{y_1, y_2, \cdots, y_m\}$, the transverse similarity degree between $X$ and $Y$ is defined as follows:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_eqn10.gif?pub-status=live)
where $\widetilde{x}_i = x_i - x_{i+1}$ is the residual of $x_i$.
Definition 10. (LTCC) Given two vectors $X = \{x_1, x_2, \cdots, x_m\}$ and $Y = \{y_1, y_2, \cdots, y_m\}$, their longitudinal deviation and transverse residual correlation coefficient is defined as follows:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_eqn11.gif?pub-status=live)
It is easy to see that $0 < LCC \leqslant 1$, $0 < TCC \leqslant 1$ and $0 < LTCC \leqslant 1$ hold. Usually, the two vectors have a strong correlation if $LTCC > 0.95$, and they are completely uncorrelated if $LTCC < 0.5$.
5.0 NUMERICAL EXPERIMENTS
In this section, we carry out two parts of experiments, one of which is the comparison experiment on the proposed correlation coefficient, and the other one includes some comparison experiments to reveal the superiority of FBRW.
With regard to the reference of weight, we employ the CA in the following experiments, which can effectively measure the importance of attributes and has been widely used in feature selection and data reduction(Reference Hu, Yu, Liu and Wu27). Therefore, it is suitable to be the reference of attribute weight determination in DS.
With the help of Weka (see footnote 2), we employ ten classifiers and 10-fold cross-validation so that the results are highly credible. The ten classification algorithms, as provided in Weka, are C4.5(J48), REPTree, NaiveBayes, SVM(SMO), IBk, Bagging, LogitBoost, FilteredClassifier, JRip and PART, and Weka's default parameters are used. The weights produced by CAs are the average values over the ten methods. All of the following experiments are carried out on the same platform and implemented in Matlab.
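The averaging step can be sketched as follows. The text states only that the CA-based weights are averages over the ten classifiers; the sketch additionally assumes that one accuracy is available per classifier and per conditional attribute, and that the averaged accuracies are normalised to sum to 1. The function name and the accuracy values are hypothetical.

```python
def reference_weights(accuracy_by_classifier):
    """Average per-attribute accuracies over classifiers, then normalise.

    accuracy_by_classifier[c][a] is the cross-validated accuracy obtained by
    classifier c for conditional attribute a.
    """
    n_clf = len(accuracy_by_classifier)
    n_attr = len(accuracy_by_classifier[0])
    mean_acc = [sum(row[a] for row in accuracy_by_classifier) / n_clf
                for a in range(n_attr)]
    total = sum(mean_acc)
    # ASSUMPTION: normalising to sum 1 is not stated in the text.
    return [m / total for m in mean_acc]


# Hypothetical accuracies of two classifiers for three attributes.
accuracies = [
    [0.6, 0.8, 0.6],
    [0.8, 0.8, 0.4],
]
ref_w = reference_weights(accuracies)
```

Averaging over several classifiers reduces the dependence of the reference weights on any single learning algorithm.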
5.1 Comparison experiments on LTCC
In this subsection, we choose CS, GJC and PCC(Reference Tan, Steinbach and Kumar34) as the objects of comparison, and use some artificial data to illustrate the superiority of LTCC. The details of the artificial data are shown in Table 1 and the corresponding curves are shown in Fig. 3.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_fig3g.jpeg?pub-status=live)
Figure 3. The curves of the artificial data.
Table 1 The details of the artificial data
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_tab2.gif?pub-status=live)
a: $r(\cdot)$ is a rounding operation. b: $i = 1, 2, \cdots, 10$.
We take the data of $f_1$ as the reference and compute the similarity degrees or correlation coefficients between each of the others and $f_1$. The results are shown in Table 2.
Table 2 The results of comparison
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_tab3.gif?pub-status=live)
From the results in Table 2, we can see that (a) all the methods give a similarity degree of 1 for $f_1 \approx f_1$; (b) CS and GJC consider that $f_1 \approx f_2$ should be given a greater similarity degree, which is not quite consistent with the actual situation in Fig. 3; (c) if the spatial translation is not considered, the similarity degrees of $f_1 \approx f_3$ and $f_1 \approx f_4$ should be equal to 1; (d) the data of $f_5$ are the rounded values of $f_1$ and so closely match $f_1$; however, PCC generates no result because their standard deviation is 0; (e) almost none of the methods assigns $f_1 \approx f_6$ and $f_1 \approx f_7$ a high similarity degree.
Through the above analyses, we can see that, by considering both the longitudinal direction and the transverse direction, LTCC can evaluate the correlation between spatial vectors more reasonably. Moreover, if the standard deviation of the data concerned is not 0, the PCC method can also produce satisfactory results. Therefore, in order to evaluate the assigned weights effectively, we employ both methods, i.e. PCC and LTCC, in the following experiments.
5.2 Comparison experiments
In this subsection, we select some commonly used objective weight determination methods and some mature methods that can produce the weights of DS as comparisons to illustrate the advantage of the proposed method. The objective methods are the Entropy method(Reference Deng, Yeh and Willis Robert9), CRITIC(Reference Diakoulaki, Mavrotas and Papayannakis14), SD(Reference Diakoulaki, Mavrotas and Papayannakis14) and CCSD(Reference Wang and Luo13), and the other mature ones are GRA(Reference Deng25), CE(Reference Liang, Chin, Dang and Yam24) and Neighbourhood Rough Set (NRS)(Reference Hu, Yu, Liu and Wu27). In the GRA model, we take the decision attribute as the optimal one to measure the importance degrees of conditional attributes, which can be considered as the weights of attributes. In the CE model, we take the conditional entropy of the decision attribute with regard to the conditional attribute as the metric to produce the weight of the conditional attribute; the smaller the conditional entropy is, the greater the weight that should be assigned. Because the discretisation method plays an important role in the CE model, we select the Supervised and Multivariate Discretization Naive Scaler (SMDNS) method as the discretisation tool, which has the best performance compared with other models in the literature(Reference Jiang and Sui31). In the NRS model, the dependencies are transformed into the weights of attributes, and the neighbourhood threshold is the same as that of FBRW, which is 0.2 in our method. In addition, the CA and the correlation coefficients produced by PCC and LTCC are employed to evaluate the performance of each method, and the University of California Irvine (UCI) data in Table 3 are used in these experiments. In Table 3, the feature is the number of conditional attributes in DS and the class is the number of decision categories.
The comparison results are shown in Tables 4 and 5 and the bold values indicate the maximum ones.
Table 3 The details of UCI data
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_tab4.gif?pub-status=live)
Table 4 The results of comparison experiments using PCC
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_tab5.gif?pub-status=live)
Table 5 The results of comparison experiments using LTCC
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_tab6.gif?pub-status=live)
From the results in Tables 4 and 5, we can see that FBRW produces the best results; in other words, its correlation coefficients reach the maximum values on almost all the datasets. It is worth noting that the NRS model yields no result for the Vote dataset, because the classes of each discrete conditional attribute always overlap with those of the decision attribute, which produces an empty positive region and hence zero weights in the NRS model. This problem is a main drawback of the NRS model.
6.0 EFFECTIVENESS EVALUATION OF FIGHTER
In this section, one of the practical applications of MADA, i.e. EEF, is carried out to illustrate the validity of the proposed method.
6.1 Index system of fighters
The index system of fighters consists of two parts, i.e. the Basic Performance (BP) index set and the Manoeuvre Performance (MP) index set, and there are some sub-attributes in the two sets. The index system can be regarded as a multi-layer system shown in Fig. 4.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_fig4g.jpeg?pub-status=live)
Figure 4. The index system of fighters.
We have sorted out the index data of some typical fighters(Reference Zhu, Zhu and Xiong1) shown in Tables 11–12 in the Appendix. These index data are classified into four categories according to the generational criteria(Reference Bongers and Torres32).
6.2 Weight assignment based on FBRW
We demonstrate the calculation process of FBRW based on the above index system. For single-layer attribute weight determination, we take the basic performance index set as an example. First of all, the raw data should be normalised into the range [0,1]. The cost normalisation model $\overline{x} = \frac{\max (x)-x}{\max (x) - \min (x)}$ is used for the second attribute, i.e. take-off wing load, because the lower the wing load is, the better the manoeuvrability will be in air combat. Meanwhile, the income normalisation model $\overline{x} = \frac{x-\min (x)}{\max (x) - \min (x)}$ is employed for the other attributes.
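The two normalisation models can be sketched as follows. This is a minimal illustration only: the function name, the `kind` parameter and the sample values are hypothetical and are not taken from the index data in the Appendix.

```python
def normalise(values, kind="income"):
    """Min-max normalise a list of raw index values into [0, 1].

    kind="income": larger raw values are better, (x - min) / (max - min)
    kind="cost":   smaller raw values are better, (max - x) / (max - min)
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # The models above divide by max - min, so a constant column is undefined.
        raise ValueError("all values are equal; normalisation is undefined")
    if kind == "income":
        return [(v - lo) / (hi - lo) for v in values]
    if kind == "cost":
        return [(hi - v) / (hi - lo) for v in values]
    raise ValueError("kind must be 'income' or 'cost'")


# Hypothetical raw values, for illustration only.
wing_load = [300.0, 350.0, 400.0]   # cost-type attribute
thrust = [10.0, 20.0, 30.0]         # income-type attribute

print(normalise(wing_load, kind="cost"))   # the lowest wing load maps to 1.0
print(normalise(thrust))                   # the highest thrust maps to 1.0
```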
Firstly, we calculate the expectation $\mu$ and the standard deviation $\sigma$ of each generation for the basic performance indexes. Secondly, we take one index value, namely the thrust of MiG-9, as an example to demonstrate the subsequent calculation. The normalised value of MiG-9’s thrust, denoted as $x_1$, is 0, and its neighbourhood is $N_{C_{11}}(x_1) = \lbrace x_1, x_2, x_3, x_4, x_5, x_6, x_7, x_{10}, x_{13}\rbrace$ when $\delta = 0.2$. Subsequently, the classification probability of $x_1$ can be calculated according to Equation (7), as follows:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_eqn12.gif?pub-status=live)
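The neighbourhood of a sample on a single normalised attribute can be sketched as below. This assumes that the distance metric reduces to the absolute difference on one attribute and that the threshold is inclusive; both are assumptions, and the sample values are hypothetical rather than the thrust data of the Appendix.

```python
def neighbourhood(values, i, delta=0.2):
    """Indices j whose value lies within delta of values[i].

    Assumes a one-dimensional absolute-difference metric and an
    inclusive threshold (|x_i - x_j| <= delta).
    """
    centre = values[i]
    return [j for j, v in enumerate(values) if abs(v - centre) <= delta]


# Hypothetical normalised values of one attribute for five samples.
normalised = [0.0, 0.05, 0.15, 0.35, 0.9]

# Neighbourhood of the first sample with delta = 0.2.
print(neighbourhood(normalised, 0))
```

A sample always belongs to its own neighbourhood, which matches the worked example in the text, where $x_1$ appears in its own neighbourhood.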
Then, the loss of $x_1$ is obtained in terms of Definition 5 and Remark 2, which is presented as follows:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_eqn13.gif?pub-status=live)
After that, the risk of $x_1$ with regard to $C_{11}$ can be produced according to Definition 4 and Theorem 1, as follows:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_eqn14.gif?pub-status=live)
Finally, we can obtain all the normalised risks of the conditional attributes, i.e. $\overline{R}_{C_1} = \lbrace 0.3546, 0.4739, 0.4739, 0.4387, 0.4223, 0.3176, 0.3097\rbrace$, and the normalised weights are $\overline{W}_{C_1} = \lbrace 0.1533, 0.1250, 0.1250, 0.1333, 0.1372, 0.1621, 0.1640\rbrace$ according to Definition 7. Moreover, the normalised weights of the manoeuvre performance indexes are $\overline{W}_{C_2} = \lbrace 0.2676, 0.1828, 0.2450, 0.3046\rbrace$.
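The reported basic performance weights are numerically consistent with the mapping $w_i = (1 - \overline{R}_i) / \sum_j (1 - \overline{R}_j)$. Definition 7 is not reproduced in this section, so this form is an inference from the numbers rather than a quotation of the definition; the function name below is hypothetical.

```python
def weights_from_risks(risks):
    """Map normalised risks to normalised weights.

    Assumed form (inferred from the reported values, not quoted from
    Definition 7): w_i = (1 - R_i) / sum_j (1 - R_j).
    """
    complements = [1.0 - r for r in risks]
    total = sum(complements)
    return [c / total for c in complements]


# Normalised risks of the basic performance indexes, as reported in the text.
risks_c1 = [0.3546, 0.4739, 0.4739, 0.4387, 0.4223, 0.3176, 0.3097]

weights_c1 = weights_from_risks(risks_c1)
print([round(w, 4) for w in weights_c1])
# -> [0.1533, 0.125, 0.125, 0.1333, 0.1372, 0.1621, 0.164]
```

A lower risk therefore yields a higher weight, and the two attributes with equal risk (0.4739) receive equal weight.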
Similarly, for the multi-layer index system, we obtain the fuzzy Bayes risks of the two index sets, which are 70.9604 and 38.8923, respectively. The weights of the two index sets are then generated as 0.4897 and 0.5103.
In addition, suppose that we crudely combine the weights of the individual indexes to obtain the weights of the attribute sets, as follows:
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_eqn15.gif?pub-status=live)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_eqn16.gif?pub-status=live)
where
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_eqn17.gif?pub-status=live)
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_eqn18.gif?pub-status=live)
are the un-normalised weights, and the constant coefficients 7 and 4 are the numbers of elements in the respective sets.
From the above results we can see that the weights determined by FBRW conform to the standard used to evaluate fighters in reality, i.e. the manoeuvre performance indexes ($C_2$) are more important than the basic performance indexes ($C_1$). However, the weights obtained by the simple combination lead to the opposite conclusion. This shows that the weights of attribute sets cannot be determined by such a crude method.
6.3 Comparison experiments
In this subsection, we again use the weight assignment methods from Subsection 5.2 to compare and analyse the practicability of FBRW. The weight determination results of the basic performance index set and the manoeuvre performance index set are shown in Tables 6–7. Therein, CA is the weight determination method based on the CAs produced by the ten classification methods in Subsection 5.2. In addition, the PCCs and LTCCs between the weights assigned by the eight methods and those of CA are shown in Table 8, where the values in bold are the maximum ones. Table 8 shows that the weights obtained by FBRW are closely related to the reference weights determined by CA.
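The PCC part of this comparison can be sketched as follows; LTCC is defined earlier in the paper and is not reproduced here. The function name and the reference vector are hypothetical, and only the FBRW manoeuvre performance weights are taken from the text. The function returns `None` when either vector has zero standard deviation, since PCC is undefined in that case.

```python
from math import sqrt


def pearson(u, v):
    """Pearson correlation coefficient of two equal-length weight vectors.

    Returns None when either vector has zero standard deviation,
    because the coefficient is undefined in that case.
    """
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    norm_u = sqrt(sum((a - mean_u) ** 2 for a in u))
    norm_v = sqrt(sum((b - mean_v) ** 2 for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return None
    return cov / (norm_u * norm_v)


# FBRW weights of the manoeuvre performance indexes, as reported in the text.
fbrw_mp = [0.2676, 0.1828, 0.2450, 0.3046]
# Hypothetical reference weights, for illustration only (not from Table 7).
reference_mp = [0.27, 0.19, 0.23, 0.31]

print(pearson(fbrw_mp, reference_mp))
print(pearson([0.25, 0.25, 0.25, 0.25], reference_mp))  # equal weights: undefined
```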
Table 6 The weights of basic performance index set
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_tab7.gif?pub-status=live)
Table 7 The weights of manoeuvre performance index set
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_tab8.gif?pub-status=live)
Table 8 The comparison results of weight assignment
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_tab9.gif?pub-status=live)
For further comparative analysis, we rank the weights determined by the nine methods in descending order, and the results are shown in Tables 9–10. In order to compare and explain the results fully, we analyse the weight assignment from two aspects, i.e. the distribution characteristics of the data and the practical meanings of the attributes. For simplicity, we take the basic performance index set as an example for the first aspect and the manoeuvre performance index set for the second one.
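As a small illustration of the ranking step, the sketch below orders the FBRW weights of the basic performance indexes reported in Subsection 6.2. The labels C11 to C17 are assumed to follow the order in which the weights are listed there; ties keep their original order because Python's sort is stable.

```python
# FBRW weights of the basic performance indexes (values from Subsection 6.2).
# The mapping of positions to labels C11..C17 is an assumption.
fbrw_bp = {
    "C11": 0.1533, "C12": 0.1250, "C13": 0.1250, "C14": 0.1333,
    "C15": 0.1372, "C16": 0.1621, "C17": 0.1640,
}

# Rank attributes by weight in descending order; equal weights (C12, C13)
# stay in their original relative order.
ranking = sorted(fbrw_bp, key=fbrw_bp.get, reverse=True)
print(ranking)
# -> ['C17', 'C16', 'C11', 'C15', 'C14', 'C12', 'C13']
```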
Table 9 The rank-order of basic performance index weights
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_tab10.gif?pub-status=live)
Table 10 The rank-order of manoeuvre performance index weights
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_tab11.gif?pub-status=live)
From the results in Table 9, we can see that there are some differences between the rank-orders of the weights obtained by the above methods. In order to analyse the reasons for these differences more accurately and clearly, we visualise the statistical distribution of the basic performance index data for each generation (shown in Fig. 5). Therein, the short horizontal lines at both ends of the boxes indicate the maximum and minimum values, the boxes span the range between the 25th and the 75th percentiles, the lines in the boxes are the median values, and the cross symbols are the outliers.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_fig5g.jpeg?pub-status=live)
Figure 5. The statistical distribution of each attribute data
From the results in Fig. 5 we can see that the generations are clearly separated in the distributions of $C_{11}$, $C_{16}$ and $C_{17}$, whereas the opposite holds for $C_{12}$ and $C_{13}$. In other words, we can easily distinguish the fighters according to the distributions of $C_{11}$, $C_{16}$ and $C_{17}$ rather than those of $C_{12}$ and $C_{13}$. Therefore, the attributes $C_{11}$, $C_{16}$ and $C_{17}$ should be assigned greater weights, and the weights of $C_{12}$ and $C_{13}$ should be less than those of the others.
According to the above analyses, we can see that the results (see Table 9) produced by CRITIC, CCSD and CE have lower credibility, which can also be verified by the results in Table 8: the PCCs of CRITIC, CCSD and CE are negative, and their LTCCs are lower than those of the other methods.
On the other hand, for the manoeuvre performance indexes, the attribute $C_{24}$ (specific excess power) has been recognised as the most important parameter for measuring the operational effectiveness of fighters, because indexes such as stable hover performance, climb rate, longitudinal acceleration and ceiling are closely related to it(Reference Zhu, Zhu and Xiong1). Therefore, $C_{24}$ should be assigned the greatest weight. The ranking results in Table 10 show that Entropy, CRITIC, CE and NRS do not put $C_{24}$ in the leading place. The attribute $C_{21}$ (thrust-to-weight) is considered a relatively important index for the combat effectiveness evaluation of fighters; it directly affects the manoeuvrability of a fighter(Reference Zhu, Zhu and Xiong1) and should also be assigned a greater weight. However, CRITIC, SD and CCSD assign $C_{21}$ to the least important position (shown in Table 10). Additionally, the significance of $C_{23}$ is greater than that of $C_{22}$ in the combat effectiveness evaluation of fighters; thus, the weight of $C_{23}$ should be greater than that of $C_{22}$. Therefore, only the rank-order results produced by FBRW and CA are in line with the actual situation.
In summary, the weights assigned by FBRW are more reasonable and explanatory than others’.
6.4 Effectiveness evaluation
In this subsection, we take seven fighters of the 4th generation, i.e. MiG-29, MiG-31, F-15A, F-16A, F/A-18A, Mirage 2000-5 and Tornado, to demonstrate the combat effectiveness evaluation. In MADA, the effectiveness evaluation is a linear weighted calculation of the sample data. The resulting effectiveness values are depicted in Figs 6-8.
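The linear weighted calculation can be sketched as below, using the FBRW weights reported in Subsection 6.2. The way the two sub-effectiveness values are combined into the total (a weighted sum with the index-set weights 0.4897 and 0.5103) is an assumption about the multi-layer aggregation, and the function names and sample values are hypothetical.

```python
# FBRW weights reported in Subsection 6.2.
W_BP = [0.1533, 0.1250, 0.1250, 0.1333, 0.1372, 0.1621, 0.1640]  # basic performance
W_MP = [0.2676, 0.1828, 0.2450, 0.3046]                          # manoeuvre performance
W_SETS = (0.4897, 0.5103)                                        # index-set weights


def weighted_sum(weights, values):
    """Linear weighted sum of normalised index values."""
    if len(weights) != len(values):
        raise ValueError("weights and values must have the same length")
    return sum(w * v for w, v in zip(weights, values))


def effectiveness(bp_values, mp_values):
    """Return (basic, manoeuvre, total) effectiveness of one fighter.

    Assumption: the total is the index-set-weighted sum of the two
    sub-effectiveness values.
    """
    basic = weighted_sum(W_BP, bp_values)
    manoeuvre = weighted_sum(W_MP, mp_values)
    total = W_SETS[0] * basic + W_SETS[1] * manoeuvre
    return basic, manoeuvre, total


# Hypothetical normalised index values of one fighter, for illustration only.
bp = [0.8, 0.6, 0.7, 0.9, 0.5, 0.8, 0.7]
mp = [0.9, 0.6, 0.7, 0.8]
print(effectiveness(bp, mp))
```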
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_fig6g.jpeg?pub-status=live)
Figure 6. The basic effectiveness of the fighters.
From the results in Fig. 6, we can see that MiG-31 has the maximum basic effectiveness, i.e. 0.8276. In contrast, the manoeuvre effectiveness of MiG-31 is the minimum one in Fig. 7. The reason can be summed up as follows. Most of the basic performance indexes of MiG-31 are better than those of the other fighters. Its manoeuvre performance indexes, however, are so poor that they result in the lowest manoeuvre effectiveness.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_fig7g.jpeg?pub-status=live)
Figure 7. The manoeuvre effectiveness of the fighters.
On the other hand, with the help of the reasonable weights of the two index sets, the total effectiveness is more in line with the actual situation than either sub-effectiveness (see Figs 6-7). From the results in Fig. 8, we can see that the F-15A fighter has the maximum effectiveness value, i.e. 0.7882, and Tornado has the minimum one. Therefore, the above fighters can be divided into two levels, which is consistent with the actual situation and the results in the literature(Reference Zhu, Zhu and Xiong1,Reference Dong, Wang and Zhang4,Reference Wang, Zhang and Xu5). The first level includes four fighters, i.e. F-15A, MiG-29, F-16A and Mirage 2000-5. The other three fighters, i.e. MiG-31, F/A-18A and Tornado, belong to the second level. With respect to F-15A and F-16A, F-15A is recognised as an outstanding fighter, and its performance is better than that of F-16A; in fact, F-16A was designed to be the partner of F-15A. The three fighters MiG-29, F-16A and Mirage 2000-5 are considered to be at the same level. Among these three, the thrust (C 21) and maximum speed (C 211) of MiG-29 are larger than those of F-16A, but its take-off wing load (C 23) is higher than F-16A’s. The power system is the weak point (the “short board”) of Mirage 2000-5; however, its take-off wing load and maximum speed make its combat performance comparable to that of F-16A. The design objectives of MiG-31 are high speed and strong firepower, which reduce its air combat capability. F/A-18A is a carrier-based aircraft whose combat performance is certainly not as good as that of the other subgrade fighters. Finally, many performance indexes of Tornado are worse than those of the other fighters, which can be seen in Tables 11–12.
![](https://static.cambridge.org/binary/version/id/urn:cambridge.org:id:binary:20180730042358774-0326:S0001924018000544:S0001924018000544_fig8g.jpeg?pub-status=live)
Figure 8. The total effectiveness of the fighters.
7.0 DISCUSSION
Based on the above experiments, we offer some further discussion as follows.
From the results of the comparison experiments we can see that, by considering the relationships between the conditional attributes and the decision attribute in DS, i.e. the risk of decision and the fuzzy membership, FBRW produces better weights than the other methods. In contrast, the compared methods have shortcomings that can be summed up as follows. (a) The Entropy, CRITIC, SD and CCSD methods do not take account of the above relationships. (b) Data-driven weight assignment is usually based on the assumption of attribute independence, so that the weights are extracted from a statistical perspective. However, GRA ignores this assumption because it considers the maximum and minimum differences between all the conditional attributes and the decision attribute, which results in some unsatisfactory weights. (c) CE is greatly affected by the discretisation method; different discretisation methods generate different partitions of the conditional attributes, which result in different weights. (d) NRS employs the dependencies of the conditional attributes with respect to the decision attribute, which may generate zero weights (see Tables 4–7).
For the effectiveness evaluation of fighters, it is undeniable that fighters are optimised for certain roles, i.e. a particular aircraft may not have a high overall effectiveness but may be the most effective in a specific role. For this kind of study, it is better to gather as many fighters with the same function (role) as possible. However, owing to the confidentiality of fighter data, our research on effectiveness evaluation can only be based on a small number of fighters with open data. These fighters have different roles (such as high-altitude combat or close combat), but they all belong to the class of combat fighters. On the other hand, the index system in our research, which consists of basic performance indexes and manoeuvre performance indexes, is the most basic system for evaluating fighters, through which the comprehensive combat performance values of fighters can be obtained. These values can be regarded as basic references for evaluating the fighters from the perspective of comprehensive performance.
With regard to weight assignment for IS, combining FBRW with a clustering method is one possible approach, where the clustering method provides labels that serve as the decision attribute for FBRW. Therefore, the FBRW method can be applied to both IS and DS.
8.0 CONCLUSION
In order to deal with the labelled multiple attribute decision-making issue, this paper proposes an objective weight assignment method, FBRW, based on the fuzzy Bayes risk. Firstly, some preliminaries are presented and the FBRW method follows, where a Gaussian kernel loss function is proposed to make up for the deficiency of the Bayes risk. Subsequently, the problem of weight determination for a multi-layer attribute set is discussed. Then, a weight assignment algorithm based on FBRW is given. In order to evaluate the credibility of the assigned weights, the LTCC model is designed. Finally, a large number of experiments are carried out, which include the comparison experiments on LTCC, the comparison experiments of weight assignment using UCI datasets, and the effectiveness evaluation of fighters. The experimental results and discussions show that (a) LTCC is suitable for evaluating the assigned weights, and (b) the proposed FBRW method is not only good at dealing with single-layer or multi-layer DS, but can also be extended to cope with IS. Compared with the other weight assignment methods, the weights produced by FBRW are more reasonable and more closely related to those determined by CA.
In future work, we will focus on combining the FBRW method with some clustering methods to deal with the weight assignment for IS and apply FBRW to other fields.
ACKNOWLEDGEMENTS
The authors thank the anonymous reviewers for their constructive comments on this study.