How can I avoid catastrophic failure of precision in Hessian calculations?

I am receiving the error message "Hessian Calculation failed." I believe this may be due to failure of precision. Is there a solution to this problem?

 

2 Answers

Accepted answer:

There are a number of possible solutions to these problems:

1. Increasing Precision.  The first task is to remove large numbers from the calculations in the procedure that computes the log-likelihood or objective function.  Often all that’s needed is to scale the data.  After that, inspect the procedures to see if any operations can be simplified to prevent a loss of precision.  For example, disaggregate the calculation.  Instead of

 a(b + c)

try this

 ab + ac

2. Using Analytical Derivatives.  Analytical derivatives avoid the problems associated with numerical derivatives.  The latest versions of Constrained Optimization MT (COMT) and Constrained Maximum Likelihood MT (CMLMT) do not require calculating all of the derivatives analytically: you can pick out the ones with the greatest potential for loss of precision, and the rest will be computed numerically.  Also, when analytical derivatives are available, the program computes the Hessian from them, which increases its precision as well.  A short sketch of supplying an analytic gradient is given below.
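For concreteness, here is a minimal sketch of supplying an analytic gradient in the CMLMT style.  The procedure name lpr, the linear model with data y and X, and the modelResults calling convention with an indicator vector ind are assumptions for illustration only; check the documentation of your CMLMT/COMT version for the exact interface.

 // Sketch only: linear model with unit-variance normal errors.
 // ind[1] asks for the log-likelihood, ind[2] for the gradient;
 // anything not supplied is computed numerically.
 proc (1) = lpr(b, y, X, ind);
     local res;
     struct modelResults mm;

     res = y - X*b;

     if ind[1];
         mm.function = -0.5*(res.*res);    // log-likelihood by observation, constants dropped
     endif;

     if ind[2];
         mm.gradient = res.*X;             // analytic gradient, one column per parameter
     endif;

     retp(mm);
 endp;

Anything you do not fill in (here the Hessian) is computed numerically, and, as noted above, whatever you do supply analytically is also used when the Hessian is formed.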

Second answer:
Another reason for the failure to invert the Hessian (which is required for the covariance matrix of the parameters) is that there isn't enough information in the data to identify one or more parameters.  This shows up as a very ill-conditioned Hessian.  The base-10 log of the condition number, log(cond(out.hessian)), is approximately the number of decimal places lost in computing the inverse.  We start with about 16 places, and about 8 of them are already lost with a numerically calculated Hessian (the usual situation), so a log condition number of 8 or more is catastrophic.  The situation can be significantly improved by providing a procedure that computes the Hessian analytically, though that can be difficult.  Otherwise you are left with providing more data containing information about the ill-identified parameters, or with a model that omits them.
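As a quick diagnostic, something along these lines reports the damage.  Here out is assumed to be the results structure returned by CMLMT or COMT, and recall that in GAUSS log() is the base-10 logarithm:

 // Condition number of the Hessian from the results structure
 condNum = cond(out.hessian);

 // Base-10 log of the condition number: roughly the number of
 // decimal digits lost when the Hessian is inverted
 print "condition number of Hessian: " condNum;
 print "approximate digits lost: " log(condNum);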

But even a well-conditioned problem can be brought to its knees by losses of precision caused by ill-advised methods of calculation.  The main source of degradation is mixing small numbers with large numbers, and this is seriously aggravated by subtraction and division.  Both issues arise in computing numerical derivatives, which involve subtracting two numbers that are very close to each other and dividing by a very small number.

I don't want to get into too much detail here, but let me say this much.  Computer numbers are stored as a mantissa (the significand) and an exponent.  All of the accuracy is in the mantissa, none in the exponent.  When one number is added to or subtracted from another, their exponents must first be made equal.  In the process, digits of the mantissa fall off the end, losing precision.  When the two numbers are very different in magnitude, large parts of the mantissa can be lost.  So the number one objective is to scale your data so that all of the numbers in the calculations are as close together in magnitude as possible.
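A two-line illustration of the magnitude problem (the particular constants are arbitrary):

 // Adding a small number to a large one and subtracting the large one
 // back should recover the small number, but the alignment step
 // throws part of the mantissa away.
 small = 1e-8;
 big = 1e+8;

 print (big + small) - big;    // 0: the small number was lost entirely
 print (1 + small) - 1;        // about 1e-8: closer magnitudes, little loss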

When your calculations include exponentials and powers, keeping the numbers under control can be difficult.  It will be very important to keep their arguments small.  Even a moderately sized number can quickly turn huge as an exponential.
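For example, exp() overflows the double-precision range once its argument passes roughly 709.  One standard way to keep the arguments small, not specific to GAUSS, is to factor out the largest term before exponentiating (the log-sum-exp idea); a hypothetical illustration:

 x = { 700, 705, 710 };        // exp(710) alone overflows double precision

 // Naive version overflows because of the exp(710) term:
 // print ln(sumc(exp(x)));

 // Factoring out the largest element keeps every argument of exp() small:
 m = maxc(x);
 print m + ln(sumc(exp(x - m)));    // roughly 710.007, computed safely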

An important tactic is to disaggregate.  Turn a*(b + c) into a*b + a*c.  Also turn (a*b*c*d) / (e*f*g*h) into (a/e)*(b/f)*(c/g)*(d/h), working to make each numerator as close in size to its denominator as possible.
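A small illustration of the pairing idea with made-up magnitudes; the aggregated form overflows long before the final (perfectly reasonable) answer is reached, while the paired form never leaves a comfortable range:

 a = 3e+150;  b = 2e+150;  c = 5e+10;  d = 4e+10;
 e = 6e+150;  f = 1e+150;  g = 2e+10;  h = 8e+10;

 // (a*b*c*d)/(e*f*g*h): the numerator overflows, since a*b*c is already
 // about 3e+311, past the ~1.8e+308 double-precision limit.
 print (a/e)*(b/f)*(c/g)*(d/h);    // 1.25, with every intermediate result near 1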

But the number one tactic is to scale your data.
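A minimal sketch of scaling, assuming x is the data matrix handed to your log-likelihood procedure (the name is illustrative):

 // Divide each column by its standard deviation so all columns have
 // comparable magnitude before estimation
 xs = x ./ stdc(x)';

 // Remember to undo the scaling afterwards: divide each estimated
 // coefficient by the standard deviation used for its column to get
 // back to the original scale.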
