NetNews Usenet Archive 1992 #27

home *** CD-ROM | disk | FTP | other *** search

/ NetNews Usenet Archive 1992 #27 / NN_1992_27.iso / spool / comp / compiler / 1894 < prev next >

Wrap

Internet Message Format | 1992-11-17 | 7.7 KB

Path: sparky!uunet!ogicse!das-news.harvard.edu!spdcc!iecc!compilers-sender From: pab@cs.arizona.edu (Peter A. Bigot) Newsgroups: comp.compilers Subject: Re: Theoretical CFG question Keywords: parse Message-ID: <92-11-085@comp.compilers> Date: 16 Nov 92 20:00:27 GMT Article-I.D.: comp.92-11-085 References: <92-11-067@comp.compilers> Sender: compilers-sender@iecc.cambridge.ma.us Reply-To: pab@cs.arizona.edu (Peter A. Bigot) Organization: U of Arizona CS Dept, Tucson Lines: 167 Approved: compilers@iecc.cambridge.ma.us norlin@essex.ecn.uoknor.edu (Norman Lin) writes: >The problem was: > * >For i>=1, let b(i) denote the string in 1(0+1) that is the >binary representation of i. > >Construct a CFG generating > + > {0,1,#} - {b(1)#b(2)# ... #b(n) | n>=1} I'll refrain from specific comments about a professor who isn't interested in addressing questions about problems nobody in a class can solve (did you try asking hir in office hours?). This is problem 4.3 in Hopcroft and Ullman. Note that although the complement of the language is not a CFL (pump on it), CFL's aren't closed under complement; they are closed under union, though, so you can simply define CFGs for each way a string could fail to be in the form b_1#b_2#...#b_n. I've attached my own solution to this from a couple years back (this is ugly TeX before I switched to cleaner LaTeX; the idea should be present). The conclusion of the class (and the professor), however, was that the problem was way too hard; and if the even numbered b_i were given in reverse, the resulting grammar would be feasible (easier to generate incorrect mirror images). As noted, the actual grammar below is not correct. \def\re#1{ \ifmmode{\rm\bf #1} \else{\bf #1} \fi} \proofsect {Method}: The idea is to split $L$ into several languages, each of which can be constructed using a context free grammar, then merging the grammars to form one for $L$. {\advance\leftskip by 0.25in \item{1} Strings which do not begin with $b_1$ are not in the language.\par \item{2} Strings in which one of the purported $b_i$ does not begin with a $\re1$ are not in the language.\par \item{3} Strings in which some sequence purported to be $b_i\#b_{i+1}$ has an error are not in the language. I assume that $b_i$ is well formed, and the error is in $b_{i+1}$; without this assumption, correctness is harder to ensure (variations in $b_i$ may make the supposedly ill-formed $b_{i+1}$ correct). This set can be further subdivided:\par {\advance\leftskip by 0.25in \item{3a} If we assume that $b_i = \re1^n$, then $b_{i+1}$ should be $\re1\re0^n$. The error is either in the leading \re1 or in the sequence of $n$ \re0s of $b_{i+1}$.\par \item{3b}. Otherwise, $b_i$ is of the form $\rho01^n$ where $n \ge 0$. $b_{i+1}$ should then be $\rho10^n$. The error can occur in four places: extra characters before the prefix, or an incorrect character match in the prefix, the \re1 immediately following the prefix, or the trailing \re0s.\par }} \hang A grammar which was derived from this partitioning is attached. Nonterminals are enclosed in angle brackets, and productions can be combined using a vertical bar when the nonterminal being rewritten is the same. (Although this grammar does not seem to generate strings which are not in $L$, there are strings of $L$ which it will not generate. Right idea; wrong details---this really is a horrible problem.) // Solution to problem 4.3, Hopcroft & Ullman // 01# // symbols in CFG <S> // External nonterminals <S> // Start symbol <S> -> <L1> // $w where w != (1 or 1#) | <L2> // #w where w [1] != 1 | <H><L3><T> // there is a bi#bj where i+1 <> j in the mix // Odds and ends--sequences of ones, digits, numbers, anything <O> -> 1<O> | //epsilon <D> -> 1 | 0 <N> -> <D><N> | //epsilon <Q> -> 0 | 1 | # <W> -> <Q><W> | //epsilon // Sequence starts with something other than b1 <L1> -> 0<W> | #<W> | 1<D><W> // There is an ill-formed integer somewhere in the mix <L2> -> <W>#0<W> | <W>##<W> // Head and tail of string: ensure that <H>w<T> makes w a bi#bj sequence. <H> -> <W># | //epsilon <T> -> #<W> | //epsilon <L3> -> <La> // Inverse of 1^n#10^n, intersect 1*#<W> | <Lb> // Inverse of p01^n#p10^n <La> -> 1<La>0 // 1s and 0s still balance | 1<La_F>1 // Bad 1 on right | #0<N> // Error--0 starts right | #1<D><N> // Too much on right | <O># // Too many 1s on left <La_F> -> 1<La_F><Q> // Maintain balance of 1s/0s | #<N> // Tack on whatever you want on right // Inverse of p01^n#p10^n // Assume error in prefix: 0 on left, 1 on right <P-L-A0> -> <D><P-L-A0><D> | 0 // Error is center 0; generate prefix around it <P-L-B0> -> <D><P-L-A0>0 // Generate 0 following prefix <P-L-C0> -> <D><P-L-C0>1 | <P-L-B0> // Generate trailing 1s in left side <P-L-D0> -> <D><P-L-C0>#<N> // Generate separating hash <P-L-E0> -> <D><P-L-E0><D> | <P-L-D0> // Generate upper end of right prefix <P-L-A1> -> <D><P-L-A1><D> | 1 // Error is center 1; generate prefix around it <P-L-B1> -> <D><P-L-A1>0 // Generate 0 following prefix <P-L-C1> -> <D><P-L-C1>1 | <P-L-B1> // Generate trailing 1s in left side <P-L-D1> -> <D><P-L-C1>#<N> // Generate separating hash <P-L-E1> -> <D><P-L-E1><D> | <P-L-D1> // Generate upper end of right prefix <P-R-A1> -> <D><P-R-A1><D> | 1 // Error is center 1; generate prefix around it <P-R-B1> -> <N>#<P-R-A1><D> // Separating hash <P-R-C1> -> 1<P-R-C1><D> | <P-R-B1> // Tail 1s of left side <P-R-D1> -> 0<P-R-C1><D> // After prefix 0 on left side <P-R-E1> -> <D><P-R-E1><D> | <P-R-D1> // Right edge of left prefix <P-R-A0> -> <D><P-R-A0><D> | 0 // Error is center 0; generate prefix around it <P-R-B0> -> <N>#<P-R-A0><D> // Separating hash <P-R-C0> -> 1<P-R-C0><D> | <P-R-B0> // Tail 1s of left side <P-R-D0> -> 0<P-R-C0><D> // After prefix 0 on left side <P-R-E0> -> <D><P-R-E0><D> | <P-R-D0> // Right edge of left prefix // Assume error at first zero after prefix: pairs with 0 on right <C-L-A0> -> <D><C-L-A0>1 | 0 // 0 after prefix <C-L-B0> -> <D><C-L-A0># // Trailing 1s plus hash <C-L-C0> -> <D><C-L-C0><D> | <C-L-B0> // Part of right prefix <C-R-Z0> -> <D><C-R-Z0>0 | 0 // Make "1" after prefix a 0 <C-R-Y0> -> #<C-R-Z0>0 // Separating hash <C-R-X0> -> 1<C-R-X0>0 | <C-R-Y0> // Last 1s of left side // Assume error in trailing 1s on left side: pairs with 1 on right <T-A> -> 1<T-A><D> // Last 1s of left tail, last Xs of right tail | #<N>1 // Sep, prefix, 0, 1s, and error <Lb> -> <P-L-A0><P-R-E1> // Break before end of left prefix | <P-L-C0><P-R-C1> // Break somewhere in trailing 1s on left | <P-L-E0><P-R-A1> // Break somewhere in left of right prefix | <P-L-A1><P-R-E0> // Break before end of left prefix | <P-L-C1><P-R-C0> // Break somewhere in trailing 1s on left | <P-L-E1><P-R-A0> // Break somewhere in left of right prefix // Error in first zero after left side prefix | <C-L-A0><C-R-X0> // Break on left side | <C-L-C0><C-R-Z0> // Break on right side // Error in trailing 1s of left side after prefix | <N>1<T-A> -- Peter A. Bigot -- pab@cs.arizona.edu Dept. of Computer Science, University of Arizona, Tucson AZ -- Send compilers articles to compilers@iecc.cambridge.ma.us or {ima | spdcc | world}!iecc!compilers. Meta-mail to compilers-request.