NLP Digest, Vol 12, Issue 5

wren ng thornton wren at
Sun Oct 30 01:20:11 BST 2011

On 10/29/11 7:04 PM, Dan Maftei wrote:
>> I mean the functions (\x ->  log (1+x)) and (\x ->  exp x - 1). They're
>> inaccurate for very small x because of details about how Float/Double
>> are represented. The details aren't too important, but just think about
>> what happens when we add a really small number to 1. There are many
>> values where 1+x == 1 because we can't represent 1+x with the available
>> precision and 1 is the closest thing we can represent.
>> Suffice it to say that (\x ->  log (1+x)) will underflow around x=2**-53
>> for Double, whereas we could implement a denotationally equivalent
>> function (\x ->  log1p x) which only underflows around x=2**-1074 for
>> Double. The same idea goes for the naive (\x ->  exp x - 1) vs a
>> denotationally equivalent implementation (\x ->  expm1 x).
>> N.B., (log1p . expm1) == id, and (expm1 . log1p) == id
> Perfect. That makes sense, and I tested on ghci too.
> But why is adding 1 to a small number then taking its log so common? When
> we move into logspace, we take logs of small numbers, and add or multiply
> them. Where does (\x ->  log (1+x)) come into play?
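The precision cliff described in the quoted paragraph is easy to see
directly. A minimal sketch, assuming a base where log1p and expm1 are
exported from Numeric (they are in recent base; in 2011 the logfloat
package provided its own bindings):

```haskell
import Numeric (expm1, log1p)

main :: IO ()
main = do
  let x = 1e-20 :: Double
  -- 1 + 1e-20 rounds to exactly 1.0 (since 1e-20 < 2**-53),
  -- so the naive forms lose x entirely:
  print (log (1 + x))  -- 0.0
  print (exp x - 1)    -- 0.0
  -- the fused versions keep the leading digits of x:
  print (log1p x)      -- ~1.0e-20
  print (expm1 x)      -- ~1.0e-20
```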

Well, consider the implementation of logSum. Naively we could define it as:

     (+) (LogFloat x) (LogFloat y) = LogFloat (log (exp x + exp y))

And in mathematics that's fine; however, when using floating point 
instead of actual real numbers, the exponentiation will cause underflow 
in all the places we're trying to avoid it. Thus, the naive 
implementation doesn't achieve our goals. Through some manipulation we 
can come up with the following implementation which avoids underflow:

     (+) (LogFloat x) (LogFloat y)
         | x == y
           && isInfinite x
           && isInfinite y = LogFloat x
         | x >= y          = LogFloat (x + log1p (exp (y - x)))
         | otherwise       = LogFloat (y + log1p (exp (x - y)))

The first case says that @0+0 == 0@ and @infinity+infinity == infinity@ 
also hold in the log-domain (where log 0 = -inf and log +inf = +inf). 
The second two cases are the same as each other, just ensuring that we 
start from the larger argument.
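To make the difference concrete, here is a small sketch comparing the
two definitions on inputs whose exp underflows (written as standalone
functions on Double rather than the LogFloat wrapper, for brevity;
log1p as exported from Numeric in recent base):

```haskell
import Numeric (log1p)

-- naive log-domain addition: log (exp x + exp y)
logSumNaive :: Double -> Double -> Double
logSumNaive x y = log (exp x + exp y)

-- the guarded version from above
logSumStable :: Double -> Double -> Double
logSumStable x y
  | x == y
    && isInfinite x
    && isInfinite y = x
  | x >= y          = x + log1p (exp (y - x))
  | otherwise       = y + log1p (exp (x - y))

main :: IO ()
main = do
  -- log-domain values so small that exp underflows to 0:
  print (logSumNaive  (-1000) (-1001))  -- -Infinity (wrong)
  print (logSumStable (-1000) (-1001))  -- ~ -999.687 (correct)
```

The naive version computes log (0 + 0) = -Infinity, while the stable
version only ever exponentiates y - x = -1, which is comfortably
representable.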

To understand this equation in the normal domain we must exponentiate it.

     exp (x + log (1 + exp (y - x)))
   = exp x * exp (log (1 + exp (y - x)))
   = exp x * (1 + exp (y - x))
   = exp x + exp x * exp (y - x)
   = exp x + exp (x + y - x)
   = exp x + exp y

So it does actually succeed in adding two log-domain numbers. But why 
do it this way? Factoring x out of y before doing the exponentiation 
ensures that the thing we're exponentiating, y - x, is smaller in 
magnitude than either argument (provided they have the same sign). By 
pushing things towards zero we're pushing them away from the 
boundaries where exp will underflow or overflow. As for why we want 
x >= y, I don't recall the details of that, so I'll have to defer to 
an IEEE-754 guru.
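For what it's worth, one concrete reason to start from the larger 
argument (an observation of this editor's, not from the original 
post): if x >= y then y - x <= 0, so exp (y - x) lies in (0, 1] and 
at worst underflows harmlessly to 0 (where log1p 0 == 0); with the 
arguments swapped, exp could overflow to Infinity and destroy the 
answer. A quick sketch:

```haskell
import Numeric (log1p)

main :: IO ()
main = do
  let x = 0; y = -800 :: Double
  -- right order: exp (y - x) = exp (-800) underflows to 0,
  -- and log1p 0 == 0, so the tiny addend is dropped harmlessly
  print (x + log1p (exp (y - x)))  -- 0.0
  -- wrong order: exp (x - y) = exp 800 overflows to Infinity,
  -- which swamps the result entirely
  print (y + log1p (exp (x - y)))  -- Infinity
```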

Another reason why log1p and expm1 are so important is that crossing 
the log/exp border is extremely expensive. The naive implementation 
crosses that border thrice (2 exp & 1 log), whereas the modified 
version only needs to cross it twice (1 exp & 1 log1p). Thus we've 
reduced the transcendental work of logSum, its dominant cost, by 
about a third! Because 1 plays such a special role with respect to 
the field operators, we can make use of log1p, expm1, and some basic 
algebra to make programs dealing with logarithms and exponentiation 
much, much faster. The fact that these functions also have special, 
highly precise implementations is icing on the cake.

Live well,
