
Archive for the category “Probability/statistics/information”

A perceptron for finding a hyper-exponential distribution.

Recently I have been looking at some data, jitter data for spike trains, which may have a hyper-exponential distribution:

p(t)=\sum p_i s_ie^{-s_i t}
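As a concrete reading of this density, here is a minimal Python sketch that evaluates p(t) and draws samples by first picking a component and then drawing an exponential time with that component's rate. The function names and the two-component parameters are made up for illustration; they are not fitted to any spike-train data.

```python
import math
import random

def hyperexp_pdf(t, probs, rates):
    """Density p(t) = sum_i p_i * s_i * exp(-s_i * t) for t >= 0, zero otherwise."""
    if t < 0:
        return 0.0
    return sum(p * s * math.exp(-s * t) for p, s in zip(probs, rates))

def hyperexp_sample(n, probs, rates, rng=random):
    """Draw n samples: choose rate s_i with probability p_i, then an exponential time."""
    samples = []
    for _ in range(n):
        s = rng.choices(rates, weights=probs, k=1)[0]
        samples.append(rng.expovariate(s))
    return samples

# Illustrative (made-up) parameters: a fast and a slow component.
probs = [0.7, 0.3]
rates = [5.0, 0.5]

rng = random.Random(0)
data = hyperexp_sample(20000, probs, rates, rng)

# The mean of the mixture is sum_i p_i / s_i = 0.7/5 + 0.3/0.5 = 0.74.
mean_theory = sum(p / s for p, s in zip(probs, rates))
print(sum(data) / len(data), mean_theory)
```

The sample mean should sit close to mean_theory, which is a quick sanity check that the sampler matches the formula.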

The idea is that an event is of type i with probability p_i, and events of type i are exponentially distributed with rate s_i. The p_i sum to one. It is hard, even with a ton of data, to fit the parameters, so I thought I might try using a perceptron as a way to do this. I started by changing the sum to an integral so

p(t)=\int_0^\infty f(s) s e^{-st} ds

which is the Laplace transform of s f(s), though it is hard to know what to do with that. Now, this means

P(t_1\le t \le t_2)=\int_0^\infty f(s) \left[e^{-st_1}-e^{-st_2}\right] ds

I imagine a situation where f(s) is compactly supported and can be sensibly discretized as f_i=f(s_i). Thinking of the stuff in square brackets as input at the input nodes and the corresponding f_i as weights, the predicted P(t_1\le t\le t_2) is the output. The corresponding data values were found by interpolation with the (a-1)th and (b+1)th points and

p=P(t_1\le t\le t_2) = \sum_i f_i \left[e^{-s_it_1}-e^{-s_it_2}\right] \delta s

was calculated. The error is now

E = p - \frac{N}{n}

with N the (interpolated) count of data points between t_1 and t_2 and n the number of points. The learning rule is applied

f_i\leftarrow f_i-\eta E \left[e^{-s_it_1}-e^{-s_it_2}\right] \delta s

Evolve until happy.
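The training loop can be sketched in a few lines of Python. This is not the code from the post, only a minimal sketch under stated assumptions: the grid for s, the Gaussian-bump f_true, the learning rate, and the way intervals are drawn are all invented here, and the target probabilities are computed from the known f rather than from data. The bracket is written as e^{-s_i t_1} - e^{-s_i t_2} so that the probability comes out positive, and E is the predicted value minus the target.

```python
import math
import random

rng = random.Random(1)

# Discretize s on a grid (range and resolution are assumptions).
m, ds = 50, 0.1
s = [(i + 0.5) * ds for i in range(m)]

# A known f(s) to try to recover: a Gaussian bump with sum(f) * ds = 1.
f_true = [math.exp(-0.5 * ((si - 2.0) / 0.5) ** 2) for si in s]
norm = sum(f_true) * ds
f_true = [fi / norm for fi in f_true]

def inputs(t1, t2):
    """Input node values [exp(-s_i t1) - exp(-s_i t2)] * ds, positive for t1 < t2."""
    return [(math.exp(-si * t1) - math.exp(-si * t2)) * ds for si in s]

def predict(f, x):
    """Perceptron output: the predicted P(t1 <= t <= t2)."""
    return sum(fi * xi for fi, xi in zip(f, x))

def random_interval():
    """A random training interval (t1, t2) with t1 < t2."""
    t1 = rng.uniform(0.0, 2.0)
    return t1, t1 + rng.uniform(0.1, 1.0)

def mse(f, intervals):
    """Mean squared error of the predicted probabilities against the known f."""
    errs = [predict(f, inputs(a, b)) - predict(f_true, inputs(a, b))
            for a, b in intervals]
    return sum(e * e for e in errs) / len(errs)

test_intervals = [random_interval() for _ in range(200)]
f = [0.0] * m   # initial weights
eta = 2.0       # learning rate (assumed)
mse_before = mse(f, test_intervals)

for _ in range(5000):
    x = inputs(*random_interval())
    E = predict(f, x) - predict(f_true, x)            # predicted minus target
    f = [fi - eta * E * xi for fi, xi in zip(f, x)]   # the learning rule

mse_after = mse(f, test_intervals)
f_gap = max(abs(a - b) for a, b in zip(f, f_true))
print(mse_before, mse_after, f_gap)
```

Here mse_after measures how well the interval probabilities are reproduced, while f_gap measures how far the learned weights are from the known f; the two need not shrink together.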

It didn't really work. Starting with some known $f(s)$, it evolves until the error is small and the predicted $p(t)$ looks a lot like the real one, but the learned $f(s)$ doesn't look much like the known one. The lesson seems to be that there are lots of ways to produce more-or-less the same distribution, which is the familiar ill-conditioning of inverting a Laplace transform.

The code is at


The Q-Q plot

So a Q-Q plot is a way of comparing two probability distributions:

For different quantile values you plot the value of one distribution against the other. For sampled data, if the two samples have the same size, you sort them and plot the ith entry of one against the ith entry of the other. If the samples differ in size, you need to do some sort of interpolation: in the simple C++ code linked below, the quantiles are defined by the smaller sample and interpolation is used to find the corresponding value in the larger sample. Obviously, if the two samples come from the same distribution, the points should more or less lie on the x=y line. If one is drawn from a distribution linearly related to the other, they will still lie on a line, but not x=y.
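To make the procedure concrete, here is a short Python sketch of it. It is not the Q-Q_plot.cpp code, and the function names are invented for illustration. It also assumes one particular convention, that the ith of n sorted points sits at quantile level i/(n-1); other conventions exist and give slightly different points.

```python
def quantile(sorted_data, q):
    """Linearly interpolated quantile of already-sorted data, for 0 <= q <= 1."""
    pos = q * (len(sorted_data) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_data) - 1)
    frac = pos - lo
    return sorted_data[lo] * (1 - frac) + sorted_data[hi] * frac

def qq_points(x, y):
    """Return (x-quantile, y-quantile) pairs; the smaller sample sets the quantile levels."""
    small, large = sorted(x), sorted(y)
    swapped = len(small) > len(large)
    if swapped:
        small, large = large, small
    n = len(small)
    points = []
    for i, v in enumerate(small):
        q = i / (n - 1) if n > 1 else 0.5   # assumed quantile convention
        w = quantile(large, q)              # interpolate in the larger sample
        points.append((w, v) if swapped else (v, w))
    return points

# Equal sizes: this reduces to pairing the sorted samples entry by entry.
print(qq_points([3, 1, 2], [30, 10, 20]))
# Different sizes: the two-point sample defines the quantile levels.
print(qq_points([0, 10], [0, 1, 2, 3, 4]))
```

Plotting the returned pairs with any plotting tool, together with the x=y line, gives the Q-Q plot.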

I don’t really get it. I can see that it is a good visual aid, and someone experienced with these plots might find them very informative, but if you want to make something quantitative out of the Q-Q plot, I assume you might as well do the Mann-Whitney-Wilcoxon U-test.

I have some code called Q-Q_plot.cpp at:

How to do Kernel Density Estimation for a density with compact support!

Normalizing the kernels that overlap the edges, or reflecting them at the edge, are both good approaches; there are even better ones involving new kernels for the edges.
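The first two approaches can be sketched in Python as follows. This is only an illustration under assumptions: a Gaussian kernel with a fixed bandwidth h, support [a, b], and made-up function names. The reflection version mirrors each kernel once in each edge, which is adequate when h is small compared with b - a.

```python
import math

def gauss(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def kde_reflect(x, data, h, a, b):
    """Gaussian KDE at x on [a, b], reflecting each kernel at both edges."""
    if x < a or x > b:
        return 0.0
    total = 0.0
    for d in data:
        total += gauss((x - d) / h)              # the kernel itself
        total += gauss((x - (2 * a - d)) / h)    # mirror image in the left edge
        total += gauss((x - (2 * b - d)) / h)    # mirror image in the right edge
    return total / (len(data) * h)

def kde_renorm(x, data, h, a, b):
    """Gaussian KDE at x on [a, b], each kernel divided by its mass inside [a, b]."""
    if x < a or x > b:
        return 0.0
    total = 0.0
    for d in data:
        mass = norm_cdf((b - d) / h) - norm_cdf((a - d) / h)
        total += gauss((x - d) / h) / (h * mass)
    return total / len(data)

# Made-up data on [0, 1], with points close to both edges.
data = [0.02, 0.1, 0.5, 0.9, 0.97]
print(kde_reflect(0.0, data, 0.05, 0.0, 1.0), kde_renorm(0.0, data, 0.05, 0.0, 1.0))
```

Both estimates integrate to one over [a, b] (the reflected one up to negligible tails), but they behave differently at the edges: reflection forces a zero slope there, while renormalizing does not.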
