Showing posts from August, 2021

To log or how to log

I avoid posting technical notes here. This is an exception because I have an agenda. Log transformation is widely used in modeling data for several reasons: making data "behave," calculating elasticities, and so on. When an outcome variable naturally has zeros, however, log transformation is tricky because the log of zero is undefined. Many data modelers (including seasoned researchers) instinctively add a positive constant to each value of the outcome variable. One popular idea is to add 1, so that raw zeros become zeros after the log transformation (log(0 + 1) = 0). Another idea is to add a very small constant, especially when the scale of the outcome variable is small. The bad news is that these are arbitrary choices, and the resulting estimates may be biased. To me, if an analysis is correlational (as most are), a small bias may not be a big concern. If it is causal and, for example, an estimated elasticity will be used to take action (with the intention of changing an outcome), that's trouble waiting to happen. This is a problem