NetNews Usenet Archive 1993 #3

home *** CD-ROM | disk | FTP | other *** search

/ NetNews Usenet Archive 1993 #3 / NN_1993_3.iso / spool / comp / speech / 510 < prev next >

Wrap

Text File | 1993-01-29 | 6.1 KB | 151 lines

Newsgroups: comp.speech Path: sparky!uunet!seas.gwu.edu!marshall From: marshall@seas.gwu.edu (Christopher Marshall) Subject: Re: Null Transitions in HMMs Message-ID: <1993Jan28.231514.24705@seas.gwu.edu> Sender: news@seas.gwu.edu Organization: George Washington University References: <1jn78mINNbhm@manuel.anu.edu.au> <1993Jan24.232204.7521@seas.gwu.edu> <1k4ft4INNmer@manuel.anu.edu.au> Date: Thu, 28 Jan 1993 23:15:14 GMT Lines: 139 >I normally assmume that for any i and j the sum of B(i,j,k) over all k >is 1. This doesn't really matter if A(i,j) is 0 but actually gives >rise to a problem later in your argument. >> >> case 1 if i!=S1: >> A'(i,j)= A(i,j) >> B'(i,j,k)= B(i,j,k) >> > >I assume you meant: case 1 if i!=S1 and j!=S2 otherwise you neglect >the case A(i,S2), which will change if A(i,S1) is non zero. No, I meant i!=S1. We are banishing null transitions from the HMM by redefining what is means to go from state i to state j. The way I am redefining it is,"Starting in state i, keep changing state until a symbol is output, STOP. Now you are in state j." Hence, A(i,S2) need not consider the path i->S1->S2, because were this path taken, the symbol would be output on the i->S1 transition. In other words, a valid transition path must end in a non-null transition, the way I have defined things, and i->S1->S2 ends in a null transition so I need not consider it. To be precise about this, we need to go back to the Xt and Yt variables, since those are what we are really redefining. Let A(i,j), B(i,j,k), Xt, Yt be associated with HMM1 that has a null transition from S1 to S2. Let A'(i,j), B'(i,j,k), X't, Y't be associated with HMM2 that we are constructing from HMM1 by removing the null transition from S1 to S2. Note that A(i,j) B(i,j,k) and A'(i,j) and B'(i,j,k) have the same definitions in terms of their respective prime or non-prime X and Y variables, namely: A(i,j)= P(Xt+1= j | Xt= i) B(i,j,k)= P(Yt= k | Xt= i, Xt+1= j) A'(i,j)= P(X't+1= j | X't= i) B'(i,j,k)= P(Y't+1= k | X't= i, X't+1= j) Now, let 0 be the symbol used to denote a null transition. Assume HMM1 outputs the sequence ABA0BA. We have the following alignment of state and output variables with the sequence: Y1 Y2 Y3 Y4 Y5 Y6 Output variable sequence. A B A 0 B A Output symbol sequence with null transition. X1 X2 X3 X4 X5 X6 X7 State variable sequence. Now, in terms of the prime variables Y'1 Y'2 Y'3 Y'4 Y'5 Output variable sequence. A B A 0 B A Output symbol sequence with null transition. X'1 X'2 X'3 X'4 X'5 X'6 State variable sequence. In laying this out, I have decided to consider 0 as an output symbol. B(S1,S2,0)= 1, B(S1,S2,k)= 0 for k!=0. This no longer violates the sum to one rule. >> case 2 if i=S1 and j!=S2: >> This is the tricky case. When considering transitions from S1 to non-S2 >> states, you have to consider the possibility that the HMM will go from >> S1 to S2 to j, emmiting a symbol on the s2 to j transition AND the >> possibility that the HMM will go directly from S1 to j. >> A'(S1,j)= A(S1,j) + A(S1,S2)*A(S2,j) >> B'(S1,j,k)= A(S1,j)*B(S1,j,k) + A(S1,S2)*A(S2,j)*B(S2,j,k) >> ----------------------------------------------- >> A(S1,j) + A(S1,S2)*A(S2,j) > >I think you are making A and B dependant in this equation. B is the >probability of an output symbol given that a transition from i to j >does occur. Otherwise the standard update rules would violate the B >summing to 1 rule (see above). Say the probability A(i,j) was changed >by the update rule. This should have no effect on the distribution of >symbols given that the transition does occur. Yes, but the 'transition i to j' means something different depending on whether you are talking about HMM1 or HMM2. In the case of HMM1 there is only one path from i to j that outputs a symbol, in the case of HMM2 there are two paths that output symbols. > >also - you assign probabilities to null transitions. This is fine but >I think it is unnecessary How can you not assign a transition probability to a null transition? What would it mean for there to be a null transition between S1 and S2 if there was no probability that such a transition occured? >> Now, what about an HMM with more than one null transition? As long >> as no two null transitions touch head to tail (so that you could make two >> or more transitions in a row and not output a symbol), then this procedure >> can be applied once for each null transition and it will work correctly. > >Why exclude the possability of null transitions touching head to tail? >Or even of loops of null transitions? As long as you get away from the >time step notion and go to one that is output stepped these do not >pose any problem. I don't know how to convert from HMM1 to HMM2 in those cases. >I have always viewed null transitions as convenience connections. They >are a shorthand way of connecting lots of states. They don't really >change the mathematics. If you make them make real changes then you'll >have to go through the rigmarole of re-proving the HMM results with >them in place. I think that some people do make them make real changes and do go through the rigmarole of re-proving the HMM results. I don't know why though. I wish I did :-) > >In my equations I have not worried about normalising. This is because >adding a connection always stuffs up the normalisation. If you want to >re-normalise then this is easy. You know that the sum of the >probabilities of leaving a node must be 1 so divide by the sum. This >means we get: > >A''(i,j) = A'(i,j) / {sum over j of A'(i,j)} > >similarly: > >B''(i,j,k) = B'(i,j,k) / {sum over k of B'(i,j,k)} > >Note that I allow B(i,j,k) to be non-zero even if A(i,j) is zero. I >don't store these in practice, of course, but they are there in >theory. In practice the equations look nothing like the ones above. On >a real computer you need to use a log representation and in my case I >use a log integer representation. Does anyone else use a log integer >form? > >Andrew In practice, the equations have to be modified to keep the intermediate calculations from going to zero exponentially in the sequence length. Chris Marshall marshall@seas.gwu.edu