How to choose metrics for evaluating classification results?


We have recently developed a Python library named PyCM, specialized in analyzing multi-class confusion matrices. Version 1.9 of this module adds a parameter recommender system that suggests the most relevant evaluation parameters based on the characteristics of the input dataset and its classification problem.
This new feature is challenging and raises many questions. First, I will explain the assumptions and describe how the recommender works; after that, I will ask some questions about evaluating its performance.



Considered characteristics:



The parameters are suggested according to the following characteristics:




  1. Classification problem type (binary or multi-class)

  2. Dataset type (balanced or imbalanced)


Note that when the problem is a binary or multi-class classification on an imbalanced dataset, only the imbalance is taken into account when recommending parameters. Therefore, the inspected states fall into three main groups:




  1. Balanced dataset – Binary classification

  2. Balanced dataset – Multi-class classification

  3. Imbalanced dataset


The definition of imbalance:



Determining whether a classification problem is binary or multi-class is easy, but the boundary between a balanced and an imbalanced dataset is not clear-cut. The PyCM module therefore introduces a definition for checking whether the input dataset is balanced: if the ratio of the population of the most populous class to that of the least populous class is greater than 3, the dataset is considered imbalanced.
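This rule of thumb can be sketched in a few lines of plain Python. The function name and signature below are illustrative, not PyCM's actual API:

```python
from collections import Counter

def is_imbalanced(labels, threshold=3):
    """Return True if the largest class is more than `threshold` times
    bigger than the smallest class (threshold=3 follows the definition
    above). `labels` is the list of class labels of the dataset."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values()) > threshold
```

Note that under this definition a dataset with a ratio of exactly 3 (e.g. 30 vs. 10 samples) still counts as balanced, since the rule requires the ratio to be strictly greater than 3.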



Recommended parameters:



The recommendation lists were compiled from the paper that introduced each parameter and the capabilities claimed in that paper. For further information, read the PyCM documentation or visit the project page.




  • Binary – Balanced recommended parameters: ACC, TPR, PPV, AUC, AUCI, TNR, F1


  • Multi-class – Balanced recommended parameters: ERR, TPR Micro, TPR Macro, PPV Micro, PPV Macro, ACC, Overall ACC, MCC, Overall MCC, BCD, Hamming Loss, Zero-one Loss


  • Imbalanced recommended parameters: Kappa, SOA1(Landis & Koch), SOA2(Fleiss), SOA3(Altman), SOA4(Cicchetti), CEN, MCEN, MCC, J, Overall J, Overall MCC, Overall CEN, Overall MCEN, AUC, AUCI, G, DP, DPI, GI
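Putting the three groups together, the selection logic described above can be sketched as follows. The recommendation lists are copied verbatim from the bullets above; the function and dictionary names are illustrative, not PyCM's actual API:

```python
from collections import Counter

# Recommendation lists copied from the bullets above.
RECOMMENDATIONS = {
    "binary-balanced": ["ACC", "TPR", "PPV", "AUC", "AUCI", "TNR", "F1"],
    "multiclass-balanced": ["ERR", "TPR Micro", "TPR Macro", "PPV Micro",
                            "PPV Macro", "ACC", "Overall ACC", "MCC",
                            "Overall MCC", "BCD", "Hamming Loss",
                            "Zero-one Loss"],
    "imbalanced": ["Kappa", "SOA1(Landis & Koch)", "SOA2(Fleiss)",
                   "SOA3(Altman)", "SOA4(Cicchetti)", "CEN", "MCEN", "MCC",
                   "J", "Overall J", "Overall MCC", "Overall CEN",
                   "Overall MCEN", "AUC", "AUCI", "G", "DP", "DPI", "GI"],
}

def recommend(labels, threshold=3):
    """Pick one of the three groups described above and return its
    recommended parameter list."""
    counts = Counter(labels)
    # Imbalance overrides the binary/multi-class distinction.
    if max(counts.values()) / min(counts.values()) > threshold:
        return RECOMMENDATIONS["imbalanced"]
    if len(counts) == 2:
        return RECOMMENDATIONS["binary-balanced"]
    return RECOMMENDATIONS["multiclass-balanced"]
```

For example, `recommend([0] * 90 + [1] * 10)` falls into the imbalanced group even though the problem is binary, which is exactly the collapsing of states questioned in point 2 below.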



Questions:
1. Is the proposed definition of imbalance correct? Is there a more comprehensive definition of this characteristic?
2. Is it correct to recommend the same parameters for both binary and multi-class classification problems on an imbalanced dataset?
3. Are the recommendation lists correct and complete? Are there other parameters worth recommending?
4. Are there other characteristics (besides binary/multi-class and balanced/imbalanced) that can affect the evaluation of a classification method's results?



Website: http://www.pycm.ir/



GitHub: https://github.com/sepandhaghighi/pycm



Paper: https://www.theoj.org/joss-papers/joss.00729/10.21105.joss.00729.pdf


























Tags: machine-learning, python, classification, multiclass-classification, confusion-matrix





Asked 3 mins ago by alireza zolanvari on Data Science Stack Exchange





















