Regression Analysis Methods (Regression Analysis, Modeling, and Prediction)

Author: 懵懂先生 (contributed article)

Source: 略懂百科, http://wswcn.cn/98018.html

Yishuo School District (33) | SPSS Statistical Analysis (43) Binary Logistic Regression Analysis


Share interests, spread happiness, broaden knowledge, and leave something good behind! Hello everyone, this is the editor. Welcome back to LearningYard; we will do our best to bring you more and better content.

In all the regression analyses discussed so far, the variables were quantitative. In practice, however, a dependent variable can be either quantitative or qualitative. Examples of qualitative dependent variables include a negative or positive result in medicine, survival or death, whether or not a purchase occurs in consumer behavior, and whether or not an IPO is approved in finance.

Many statistical methods can handle a qualitative dependent variable, such as discriminant analysis, probit analysis, logistic regression, and log-linear analysis. In the social sciences, logistic regression is the most widely used. Depending on the number of categories the dependent variable can take, logistic regression is divided into binary logistic regression and multinomial logistic regression. In a binary logistic regression model, the dependent variable can take only two values, 1 and 0 (a dummy variable). Let's use an example to briefly introduce the binary logistic regression model.

Identifying financial enterprises that are operating poorly is an important function of audit review, and misclassification in an audit can have disastrous consequences. The chart below lists selected operating financial ratios for 66 companies, 33 of which went bankrupt two years later (y=0) while the other 33 remained solvent over the same period (y=1). Fit a logistic regression model using the variables x1 (retained earnings/total assets), x2 (pre-tax profit/total assets), and x3 (sales/total assets).
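For reference, the model being fitted can be written in the standard textbook form of binary logistic regression. The coefficient symbols $\beta_0, \dots, \beta_3$ are notation introduced here for illustration; they do not appear in the original tables.

```latex
P(y = 1 \mid x_1, x_2, x_3)
  = \frac{1}{1 + e^{-(\beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3)}},
\qquad
\ln\frac{P(y = 1)}{1 - P(y = 1)}
  = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 .
```

Here $y = 1$ means the company is still solvent after two years, so the left-hand side of the second equation is the log-odds of solvency versus bankruptcy.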

Step 1

Analyze and organize the data. There are three independent variables, all of them quantitative. The dependent variable is qualitative and takes two states (0 and 1), which makes this a typical problem for binary logistic regression. We define the three independent variables x1, x2, and x3, then define the dependent variable y, enter the data, and save the file.
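Outside SPSS, the same data layout can be sketched in Python. This is only an illustration: the `Company` class, the `validate` helper, and the four rows are hypothetical placeholders, not the 66 companies from the article.

```python
from dataclasses import dataclass


@dataclass
class Company:
    x1: float  # retained earnings / total assets
    x2: float  # pre-tax profit / total assets
    x3: float  # sales / total assets
    y: int     # 1 = solvent after two years, 0 = bankrupt


def validate(rows):
    """Check that every outcome is binary and count the cases per class."""
    counts = {0: 0, 1: 0}
    for row in rows:
        if row.y not in (0, 1):
            raise ValueError(f"y must be 0 or 1, got {row.y!r}")
        counts[row.y] += 1
    return counts


# HYPOTHETICAL placeholder rows, not data from the article.
rows = [
    Company(x1=-0.45, x2=-0.10, x3=1.2, y=0),
    Company(x1=-0.20, x2=-0.05, x3=0.9, y=0),
    Company(x1=0.30, x2=0.12, x3=1.8, y=1),
    Company(x1=0.25, x2=0.08, x3=2.1, y=1),
]

print(validate(rows))
```

Counting the cases per class before fitting is a quick way to confirm the 33/33 split described in the example.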

Step 2

Set up the binary logistic regression. Select the menu "Analysis -> Regression -> Binary Logistic" to open the binary logistic regression dialog box, and configure it as shown in the figure below.
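To make concrete what such a fit does, here is a minimal sketch of maximum-likelihood estimation by plain gradient ascent in pure Python. It is not SPSS's own algorithm, and the function names, learning rate, iteration count, and the six toy data points are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_proba(beta, xi):
    """P(y = 1 | xi) for coefficients beta = [intercept, b1, ..., bk]."""
    return sigmoid(beta[0] + sum(b * v for b, v in zip(beta[1:], xi)))


def fit_logistic(X, y, lr=0.1, n_iter=5000):
    """Maximize the log-likelihood with fixed-step gradient ascent."""
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(n_iter):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            err = yi - predict_proba(beta, xi)  # score contribution
            grad[0] += err
            for j, v in enumerate(xi):
                grad[j + 1] += err * v
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


# HYPOTHETICAL toy data with one predictor; the classes overlap on purpose
# so that a finite maximum-likelihood estimate exists.
X = [[-1.0], [-0.5], [0.2], [-0.1], [0.4], [1.0]]
y = [0, 0, 0, 1, 1, 1]

beta = fit_logistic(X, y)
print(beta, predict_proba(beta, [1.0]))
```

With three predictors, as in the article, each row of `X` would simply hold `[x1, x2, x3]`.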

Step 3

Main results and their analysis.

The figure below is the case processing summary, which reports the number of records that entered the model.

The figure below is the dependent variable encoding table. SPSS internally codes the lower of the two observed values of the dependent variable as 0 and the higher as 1, regardless of how often each occurs. In this example the two categories occur equally often (33 cases each), and the table shows that "bankrupt after two years" is coded 0 and "still solvent after two years" is coded 1.

The figure below is the initial classification table. At this stage the model contains no independent variables, only the constant term. The left side of the table shows the observed values, and the right side shows the model's predicted values and the percentage correct. Here the model predicts that every company will still be solvent after two years, so the prediction accuracy is 50%.
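The 50% figure follows directly from the 33/33 split. The sketch below works it out for an intercept-only model; the function name `null_model` is hypothetical, and the tie rule (predict class 1 when the probability is exactly 0.5) is an assumption chosen to match the table described above.

```python
import math


def null_model(n_pos, n_neg):
    """Intercept-only logistic model.

    Returns (intercept, predicted class for every case, accuracy).
    """
    n = n_pos + n_neg
    p = n_pos / n                       # fitted P(y = 1) for every case
    intercept = math.log(p / (1 - p))   # log-odds of y = 1
    predicted = 1 if p >= 0.5 else 0    # assumed tie rule: 0.5 goes to class 1
    correct = n_pos if predicted == 1 else n_neg
    return intercept, predicted, correct / n


intercept, predicted, accuracy = null_model(33, 33)
print(intercept, predicted, accuracy)  # 0.0 1 0.5
```

With equal class sizes the log-odds is ln(1) = 0, which is why a constant of 0.000 is the expected estimate for this starting model.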

The next two tables give the test results for this initial model. The constant term has a coefficient of 0.000 and a significance probability of 1, so the constant is not significant. The p-values for x1, x2, and x3 are 0.000 (that is, below 0.001), 0.000, and 0.094, respectively. At the 5% significance level, x1 and x2 are significant, while x3 is not.

The figure below shows the Omnibus tests of model coefficients. Three likelihood ratio tests are reported: between steps (Step), between blocks (Block), and between models (Model). Because this example has only one block of independent variables and all of them are entered at once with the forced-entry (Enter) method, the three tests give identical results, and the model is statistically significant.
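A likelihood ratio test of this kind compares the log-likelihood of the constant-only model with that of the fitted model. The sketch below shows the arithmetic under stated assumptions: the fitted log-likelihood `ll_model` is a HYPOTHETICAL value (the article's table is not reproduced here), and the p-value uses the closed form of the chi-square upper tail for 3 degrees of freedom, one per predictor.

```python
import math


def null_log_likelihood(n_pos, n_neg):
    """Log-likelihood of the intercept-only model."""
    n = n_pos + n_neg
    p = n_pos / n
    return n_pos * math.log(p) + n_neg * math.log(1 - p)


def lr_statistic(ll_null, ll_model):
    """Likelihood ratio chi-square: -2 * (LL_null - LL_model)."""
    return -2.0 * (ll_null - ll_model)


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 df."""
    return (math.erfc(math.sqrt(x / 2.0))
            + math.sqrt(2.0 * x / math.pi) * math.exp(-x / 2.0))


ll_null = null_log_likelihood(33, 33)  # equals 66 * ln(0.5)
ll_model = -10.0                       # HYPOTHETICAL fitted log-likelihood
stat = lr_statistic(ll_null, ll_model)
print(round(-2.0 * ll_null, 3), round(stat, 3), chi2_sf_df3(stat))
```

The first printed number, the -2 log-likelihood of the constant-only model, depends only on the 33/33 split; the other two depend on the assumed `ll_model`.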

The figure below is the model summary table. It reports the -2 log-likelihood together with two pseudo coefficients of determination (in SPSS output, the Cox & Snell and Nagelkerke R squared). Judging from these values, the model fits reasonably well.
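Both pseudo R squared values can be computed from the two log-likelihoods and the sample size. The sketch below uses the standard Cox & Snell and Nagelkerke formulas; the function names are hypothetical, and the fitted log-likelihood passed in the usage line is an assumed value, not the article's result.

```python
import math


def cox_snell_r2(ll_null, ll_model, n):
    """Cox & Snell R^2 = 1 - exp(2 * (LL_null - LL_model) / n)."""
    return 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)


def nagelkerke_r2(ll_null, ll_model, n):
    """Cox & Snell R^2 rescaled by its maximum attainable value."""
    max_r2 = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell_r2(ll_null, ll_model, n) / max_r2


n = 66
ll_null = n * math.log(0.5)   # intercept-only model with a 33/33 split
ll_model = -10.0              # HYPOTHETICAL fitted log-likelihood
print(cox_snell_r2(ll_null, ll_model, n), nagelkerke_r2(ll_null, ll_model, n))
```

For a 33/33 split the Cox & Snell value cannot exceed 0.75, which is why the Nagelkerke rescaling is reported alongside it.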

The figure below is the classification table for the fitted model. The prediction accuracy has now reached 97%.

Below are the fitted results of the logistic model. From left to right, the table gives, for each variable and the constant term, the coefficient (B), the standard error (S.E.), the Wald chi-square value, the degrees of freedom (df), the significance probability, and Exp(B). Because all the regression coefficients are positive, each Exp(B) is greater than 1. This means that the larger the values of x1, x2, and x3, the higher the odds of "solvent after two years" relative to "bankrupt after two years".
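The relationship between the columns of that table can be sketched in a few lines. The coefficient and standard error below are HYPOTHETICAL numbers chosen for easy arithmetic, not values from the article, and `wald_and_odds_ratio` is an illustrative helper name.

```python
import math


def wald_and_odds_ratio(b, se):
    """Return (Wald chi-square with 1 df, Exp(B)) for one coefficient."""
    wald = (b / se) ** 2
    return wald, math.exp(b)


# HYPOTHETICAL coefficient and standard error.
wald, odds_ratio = wald_and_odds_ratio(b=0.5, se=0.25)
print(wald, odds_ratio)
```

A positive B always gives Exp(B) above 1: each one-unit increase in the predictor multiplies the odds of solvency by Exp(B), holding the other predictors fixed.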

If the predicted probability p is less than 0.5, the case is assigned to the "bankrupt after two years" group; otherwise it is assigned to the "solvent after two years" group. The prediction results are shown in the figure below, where PRE_1 is the predicted probability and PGR_1 is the predicted group.
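The cutoff rule and the accuracy calculation can be sketched as follows. The names `pre_1` and `pgr_1` mirror the SPSS variables PRE_1 and PGR_1, but the probabilities and observed outcomes are HYPOTHETICAL, not the article's data.

```python
def classify(probabilities, cutoff=0.5):
    """Assign 0 (bankrupt) when p < cutoff, otherwise 1 (solvent)."""
    return [0 if p < cutoff else 1 for p in probabilities]


def accuracy(actual, predicted):
    """Share of cases whose predicted group matches the observed group."""
    return sum(a == b for a, b in zip(actual, predicted)) / len(actual)


# HYPOTHETICAL predicted probabilities and observed outcomes.
pre_1 = [0.03, 0.48, 0.50, 0.91]
actual = [0, 1, 1, 1]

pgr_1 = classify(pre_1)
print(pgr_1, accuracy(actual, pgr_1))  # [0, 0, 1, 1] 0.75
```

On 66 cases, an accuracy of about 97% would correspond to 64 correct classifications, since 64/66 is roughly 0.970.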

Preview of the next issue: In this issue, we learned the practical operation of nonlinear regression. In the next issue, we will learn about cluster analysis and discriminant analysis.

That's all for today's sharing. If you have your own thoughts on today's article, please leave us a message. See you tomorrow, and have a happy day!

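References: Baidu Baike; 《SPSS 23 统计分析实用教程》 (A Practical Tutorial on Statistical Analysis with SPSS 23)

Translation: Baidu Translate

This article is an original work by learningyard新学苑. Some text and images come from other sources; if there is any infringement, please contact us for removal.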


懵懂先生
  • Published on March 1, 2023, 15:11:32
  • When reposting, please credit: http://wswcn.cn/98018.html