iteratee roll gives incorrect results
jwlato at gmail.com
Wed Mar 16 11:57:58 GMT 2011
Indeed peeking beyond just the stream head is difficult, for exactly the
reason you give. Separating EOF from chunks in stream handling could
resolve this to some extent, but I don't want to force another major change
like this on users just yet.
Other than just reading a big block, it would be relatively simple to
implement peeking within the current chunk. If you're looking a small
number of elements ahead, this may be useful.
On Wed, Mar 16, 2011 at 2:31 AM, Akio Takano <tkn.akio at gmail.com> wrote:
> Thank you for the reply. My usage of roll was indeed as a super-peek,
> but that behavior seems to be hard to implement. I will try to go
> without it.
> I have no idea on what is the best design, but I think any of your
> options (or even just dropping roll from the export list) would be
> better than the current status.
> -- Takano Akio
> 2011/3/15 John Lato <jwlato at gmail.com>:
> > Hi,
> > Thanks very much for reporting this. I suppose I'm not too surprised
> > one of the very few operations in ListLike I don't yet have a test case
> > has a bug. I wasn't ready to include 'roll' in the distribution because
> > wasn't sure it was correct, but accidentally committed it anyway and left
> > it.
> > Regarding the modification to (>>=), I think I would rather avoid it due
> > the increased complexity. Modifying 'Stream' may be right way to fix
> > but it would be a big rewrite. It would affect user code too.
> > There are two other approaches to consider:
> > 1) Change the semantics so that if EOF is encountered, 'roll' will return
> > all elements up to EOF and then propagate EOF. This would treat EOF as a
> > special case and force the caller to deal with it. I don't like the
> > behavioral inconsistency, but it would only be a patch-level change.
> > 2) Make 'roll' an enumeratee, a not-quite-fully-generalized signature
> > be:
> > roll :: Int -> Int -> Iteratee [s] m a -> Iteratee s m (Iteratee [s] m a)
> > The inner iteratee would be feed appropriate chunks, but 'roll' would
> > consume as much of the stream as necessary. The behavior would be
> > like this:
> > iter = do
> > s1 <- I.joinI $ roll 4 2 I.stream2list
> > s2 <- I.stream2list
> > return $ show (s1::[[Int]]) ++ " " ++ show (s2::[Int])
> > main1 = I.enumPureNChunk [1, 2, 3] 2 iter >>= run >>= putStrLn
> >>> [[1,2,3],] 
> > main2 = I.enumPureNChunk [1, 2, 3, 4] 2 iter >>= run >>= putStrLn
> >>> [[1,2,3,4],[3,4]] 
> > main3 = I.enumPureNChunk [1..6] 2 iter >>= run >>= putStrLn
> >>> [[1,2,3,4],[3,4]] [5,6]
> > This is arguably better for producing a sliding-window view of a stream,
> > it would completely rule out using roll as a super-peek.
> > John
> > On Mon, Mar 14, 2011 at 10:04 AM, Akio Takano <tkn.akio at gmail.com>
> >> Hi,
> >> I've found that `roll' from Data.Iteratee.ListLike does not work as
> >> expected. The following program demonstrates the problem. it prints
> >> "[[1,2]] [2,3]" while it should print "[[1,2,3]] [2,3]".
> >> import qualified Data.Iteratee.ListLike as I
> >> import Data.Iteratee.Iteratee
> >> main = I.enumPureNChunk [1, 2, 3] 2 iter >>= run >>= putStrLn
> >> where
> >> iter = do
> >> xs <- I.roll 4 1
> >> ys <- I.stream2list
> >> return $ show (xs::[[Int]]) ++ " " ++ show (ys::[Int])
> >> I tried to fix it, but it turned out to be not easy. I try to explain
> >> the situation:
> >> 1. When an iteratee moves to the Done state, it must return the
> >> unconsumed part of the stream. Perhaps this means it must return a
> >> stream that is a subsequence of the given stream. Though I'm not sure
> >> how strictly this rule is enforced, at least (>>=) expects its LHS to
> >> return EOF when it sends EOF to the LHS. Otherwise the EOF in the
> >> stream gets lost and the resulting iteratee of (>>=) does not
> >> terminate on EOF.
> >> 2. An iteratee can return some elements or an EOF marker as the
> >> unconsumed part, but not both at the same time.
> >> Suppose you are evaluating (enumPure1Chunk  (roll 2 0) >>= run).
> >> The iteratee first receives (Chunk ) and continues because it has
> >> not seen enough number of stream elements to return a result. Next it
> >> receives (EOF Nothing). It should move to the Done state now, but it's
> >> not clear what stream it should return. If it returns (Chunk ) it
> >> violates "1." above and will not be able to be used with (>>=). If it
> >> returns (EOF Nothing), that means it consumed the whole stream, which
> >> is an incorrect result because roll is called with d=0.
> >> In general there seems to be no way for an iteratee to safely look
> >> ahead to an element beyond the next one, without consuming any
> >> elements.
> >> I can think of two options to solve or work around the problem.
> >> - Modify (>>=) so that it will allow its LHS to return some elements
> >> on EOF. In that case it will explicitly feed EOF to its RHS. I
> >> attached code showing the approach. I'm not sure this is a correct
> >> thing, especially regarding the smantics and the performance of (>>=).
> >> Also I haven't confirmed that (>>=) is the only function that depends
> >> on the EOF->EOF assumption.
> >> - Change the definition of Stream so that they can contain some
> >> elements and an EOF marker at the same time. Then an iteratee will be
> >> able to "put back" any number of elements onto the stream. This will
> >> require a global rewrite of the library.
> >> Any ideas?
> >> Takano Akio
> >> _______________________________________________
> >> Iteratee mailing list
> >> Iteratee at projects.haskell.org
> >> http://projects.haskell.org/cgi-bin/mailman/listinfo/iteratee
-------------- next part --------------
An HTML attachment was scrubbed...
More information about the Iteratee