home *** CD-ROM | disk | FTP | other *** search
- Newsgroups: comp.speech
- Path: sparky!uunet!seas.gwu.edu!marshall
- From: marshall@seas.gwu.edu (Christopher Marshall)
- Subject: Re: Null Transitions in HMMs
- Message-ID: <1993Jan28.231514.24705@seas.gwu.edu>
- Sender: news@seas.gwu.edu
- Organization: George Washington University
- References: <1jn78mINNbhm@manuel.anu.edu.au> <1993Jan24.232204.7521@seas.gwu.edu> <1k4ft4INNmer@manuel.anu.edu.au>
- Date: Thu, 28 Jan 1993 23:15:14 GMT
- Lines: 139
-
- >I normally assmume that for any i and j the sum of B(i,j,k) over all k
- >is 1. This doesn't really matter if A(i,j) is 0 but actually gives
- >rise to a problem later in your argument.
- >>
- >> case 1 if i!=S1:
- >> A'(i,j)= A(i,j)
- >> B'(i,j,k)= B(i,j,k)
- >>
- >
- >I assume you meant: case 1 if i!=S1 and j!=S2 otherwise you neglect
- >the case A(i,S2), which will change if A(i,S1) is non zero.
-
- No, I meant i!=S1.
-
- We are banishing null transitions from the HMM by redefining
- what is means to go from state i to state j. The way I am redefining
- it is,"Starting in state i, keep changing state until a symbol is
- output, STOP. Now you are in state j." Hence, A(i,S2) need not
- consider the path i->S1->S2, because were this path taken, the symbol
- would be output on the i->S1 transition. In other words, a valid
- transition path must end in a non-null transition, the way I have
- defined things, and i->S1->S2 ends in a null transition so I need not
- consider it.
-
- To be precise about this, we need to go back to the Xt and Yt variables,
- since those are what we are really redefining.
-
- Let A(i,j), B(i,j,k), Xt, Yt be associated with HMM1 that has
- a null transition from S1 to S2.
-
- Let A'(i,j), B'(i,j,k), X't, Y't be associated with HMM2 that we
- are constructing from HMM1 by removing the null transition from S1 to S2.
-
- Note that A(i,j) B(i,j,k) and A'(i,j) and B'(i,j,k) have the same definitions
- in terms of their respective prime or non-prime X and Y variables, namely:
-
- A(i,j)= P(Xt+1= j | Xt= i)
- B(i,j,k)= P(Yt= k | Xt= i, Xt+1= j)
-
- A'(i,j)= P(X't+1= j | X't= i)
- B'(i,j,k)= P(Y't+1= k | X't= i, X't+1= j)
-
- Now, let 0 be the symbol used to denote a null transition.
- Assume HMM1 outputs the sequence ABA0BA. We have the following alignment
- of state and output variables with the sequence:
-
- Y1 Y2 Y3 Y4 Y5 Y6 Output variable sequence.
- A B A 0 B A Output symbol sequence with null transition.
- X1 X2 X3 X4 X5 X6 X7 State variable sequence.
-
- Now, in terms of the prime variables
-
- Y'1 Y'2 Y'3 Y'4 Y'5 Output variable sequence.
- A B A 0 B A Output symbol sequence with null transition.
- X'1 X'2 X'3 X'4 X'5 X'6 State variable sequence.
-
- In laying this out, I have decided to consider 0 as an output symbol.
- B(S1,S2,0)= 1, B(S1,S2,k)= 0 for k!=0. This no longer violates the
- sum to one rule.
-
- >> case 2 if i=S1 and j!=S2:
- >> This is the tricky case. When considering transitions from S1 to non-S2
- >> states, you have to consider the possibility that the HMM will go from
- >> S1 to S2 to j, emmiting a symbol on the s2 to j transition AND the
- >> possibility that the HMM will go directly from S1 to j.
- >> A'(S1,j)= A(S1,j) + A(S1,S2)*A(S2,j)
- >> B'(S1,j,k)= A(S1,j)*B(S1,j,k) + A(S1,S2)*A(S2,j)*B(S2,j,k)
- >> -----------------------------------------------
- >> A(S1,j) + A(S1,S2)*A(S2,j)
- >
- >I think you are making A and B dependant in this equation. B is the
- >probability of an output symbol given that a transition from i to j
- >does occur. Otherwise the standard update rules would violate the B
- >summing to 1 rule (see above). Say the probability A(i,j) was changed
- >by the update rule. This should have no effect on the distribution of
- >symbols given that the transition does occur.
-
- Yes, but the 'transition i to j' means something different depending
- on whether you are talking about HMM1 or HMM2. In the case of HMM1
- there is only one path from i to j that outputs a symbol, in the
- case of HMM2 there are two paths that output symbols.
-
- >
- >also - you assign probabilities to null transitions. This is fine but
- >I think it is unnecessary
-
- How can you not assign a transition probability to a null transition?
- What would it mean for there to be a null transition between S1 and S2
- if there was no probability that such a transition occured?
-
- >> Now, what about an HMM with more than one null transition? As long
- >> as no two null transitions touch head to tail (so that you could make two
- >> or more transitions in a row and not output a symbol), then this procedure
- >> can be applied once for each null transition and it will work correctly.
- >
- >Why exclude the possability of null transitions touching head to tail?
- >Or even of loops of null transitions? As long as you get away from the
- >time step notion and go to one that is output stepped these do not
- >pose any problem.
-
- I don't know how to convert from HMM1 to HMM2 in those cases.
-
- >I have always viewed null transitions as convenience connections. They
- >are a shorthand way of connecting lots of states. They don't really
- >change the mathematics. If you make them make real changes then you'll
- >have to go through the rigmarole of re-proving the HMM results with
- >them in place.
- I think that some people do make them make real changes and do go through
- the rigmarole of re-proving the HMM results. I don't know why though.
- I wish I did :-)
-
- >
- >In my equations I have not worried about normalising. This is because
- >adding a connection always stuffs up the normalisation. If you want to
- >re-normalise then this is easy. You know that the sum of the
- >probabilities of leaving a node must be 1 so divide by the sum. This
- >means we get:
- >
- >A''(i,j) = A'(i,j) / {sum over j of A'(i,j)}
- >
- >similarly:
- >
- >B''(i,j,k) = B'(i,j,k) / {sum over k of B'(i,j,k)}
- >
- >Note that I allow B(i,j,k) to be non-zero even if A(i,j) is zero. I
- >don't store these in practice, of course, but they are there in
- >theory. In practice the equations look nothing like the ones above. On
- >a real computer you need to use a log representation and in my case I
- >use a log integer representation. Does anyone else use a log integer
- >form?
- >
- >Andrew
-
- In practice, the equations have to be modified to keep
- the intermediate calculations from going to zero exponentially in the
- sequence length.
-
- Chris Marshall
- marshall@seas.gwu.edu
-