When we want to build a decision tree, there are many options to choose from, but formatting the tree is not always easy. I will show a way to create and format a tree and to list the rules of the inner nodes. Being able to read these rules can be very useful in an analysis. I will use the partykit package. We will also need the data.table package for listing the rules of the inner nodes and, optionally, the xlsx package if we want to save these rules to an xls file.
There are different variable selection methods for multiple linear regression that test the predictor variables in order to make the analysis more efficient. Variable selection is a contested topic, but in some cases it can be useful. I am going to review four common methods (enter, forward selection, backward elimination, stepwise selection) and check how the results differ using bootstrapping (number of replications: 100). Continue reading “Variable selection for multiple linear regression”
With the ReporteRs package we can automate our reports, which saves a lot of time and avoids the errors that can occur in manual reporting. Of course, there are other ways to create reports as well, for example with knitr, but with ReporteRs we can create great PowerPoint or Word reports.
An interesting question is what kinds of correlation matrices are possible with three variables (A, B and C). If we know that there is a correlation between A and B as well as between B and C, what kind of correlation can occur between A and C? What are the maximum possible values of the correlation between A and B and between B and C when the correlation between A and C is zero? Continue reading “Possible correlation matrices (3 variables)”
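The question above can be sketched numerically. This is a minimal illustration in Python, not the method used in the post: it relies only on the standard fact that a 3x3 correlation matrix is valid exactly when it is positive semidefinite, which for unit diagonal and correlations in [-1, 1] comes down to a non-negative determinant. The function names `ac_bounds` and `is_valid` are hypothetical and chosen for this example.

```python
import math


def ac_bounds(r_ab, r_bc):
    """Return the (lowest, highest) r_AC compatible with the given r_AB and r_BC.

    Follows from requiring det >= 0 for the 3x3 correlation matrix:
    1 + 2*r_AB*r_BC*r_AC - r_AB^2 - r_BC^2 - r_AC^2 >= 0.
    """
    slack = math.sqrt((1 - r_ab ** 2) * (1 - r_bc ** 2))
    center = r_ab * r_bc
    return center - slack, center + slack


def is_valid(r_ab, r_bc, r_ac, tol=1e-12):
    """Check whether the three correlations form a valid correlation matrix."""
    in_range = all(abs(r) <= 1 for r in (r_ab, r_bc, r_ac))
    det = 1 + 2 * r_ab * r_bc * r_ac - r_ab ** 2 - r_bc ** 2 - r_ac ** 2
    # tol absorbs floating-point rounding on the boundary det == 0
    return in_range and det >= -tol


if __name__ == "__main__":
    low, high = ac_bounds(0.8, 0.8)
    print(f"r_AB = r_BC = 0.8 -> r_AC between {low:.2f} and {high:.2f}")
    print("r_AB = r_BC = 0.7 with r_AC = 0 valid:", is_valid(0.7, 0.7, 0.0))
    print("r_AB = r_BC = 0.8 with r_AC = 0 valid:", is_valid(0.8, 0.8, 0.0))
```

When the correlation between A and C is zero, the determinant condition reduces to r_AB² + r_BC² ≤ 1, so if both correlations are equal, each can be at most √0.5 ≈ 0.707.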