# EIO Redesign

### From Erights

Kevin Reid (Talk | contribs) (infodump) |
(wiki doesn't wrap code, so doing it manually) |
||

Line 94: | Line 94: | ||

/** The Size type should typically support addition, subtraction, and isZero. | /** The Size type should typically support addition, subtraction, and isZero. | ||

* | * | ||

- | * An earlier version of this abstraction was designed by Kevin Reid with help from Joe Geldart and ideas from the Haskell Monoid type class and <http://conal.net/blog/posts/sequences-segments-and-signals/>. | + | * An earlier version of this abstraction was designed by Kevin Reid with help from |

+ | * Joe Geldart and ideas from the Haskell Monoid type class and | ||

+ | * <http://conal.net/blog/posts/sequences-segments-and-signals/>. | ||

*/ | */ | ||

interface "org.erights.e.elib.eio.segment"[Size] { | interface "org.erights.e.elib.eio.segment"[Size] { | ||

- | /** The size of a segment. When possible, this operation should only be applied to segments which are the left side of a d.split(); this permits infinite segments. */ | + | /** The size of a segment. When possible, this operation should only be applied to |

+ | * segments which are the left side of a d.split(); this permits infinite segments. | ||

+ | */ | ||

to size() :Size | to size() :Size | ||

Line 107: | Line 111: | ||

* Law: a.desegment([b]).split(a.size())[0] is equivalent to a. | * Law: a.desegment([b]).split(a.size())[0] is equivalent to a. | ||

* | * | ||

- | * Not a law: d.join(a, b).split(a.size())[1] is equivalent to b. (For example, maps and sets cannot implement this.) | + | * Not a law: d.join(a, b).split(a.size())[1] is equivalent to b. (For example, maps |

+ | * and sets cannot implement this.) | ||

*/ | */ | ||

to desegment(segments :List[Segment]) :Segment | to desegment(segments :List[Segment]) :Segment | ||

- | /** Break this segment up into two segments, where the size of the first is the minimum of the given size and the size of this. | + | /** Break this segment up into two segments, where the size of the first is the minimum of |

+ | * the given size and the size of this. | ||

* | * | ||

* Law: def [a,b] := x.segment(_); a.desegment([b]) is equivalent to x. | * Law: def [a,b] := x.segment(_); a.desegment([b]) is equivalent to x. | ||

* | * | ||

- | * Law: x.segment(someSize - someSize)[0] = identity, such that identity.desegment([x]) is equivalent to x. | + | * Law: x.segment(someSize - someSize)[0] = identity, such that identity.desegment([x]) is |

+ | * equivalent to x. | ||

* | * | ||

- | * Law: If EIO.getALL() is in the range of Size, then a.split(ALL) is equivalent to [a, a.split(zero)[0]]. | + | * Law: If EIO.getALL() is in the range of Size, then a.split(ALL) is equivalent to |

+ | * [a, a.split(zero)[0]]. | ||

*/ | */ | ||

to segment(size :Size) :Tuple[Segment[Size], Segment[Size]] | to segment(size :Size) :Tuple[Segment[Size], Segment[Size]] |

## Revision as of 23:53, 4 February 2011

As the only implementor of EIO, Kevin Reid feels that the original EIO design is too heavyweight; it effectively requires input streams to have buffers and an elaborate queueing mechanism, and does not map well to POSIX IO operations, thus creating inefficiency and discouraging compositional programming.

XXX Explain whch API variant the last kEIO implements.

XXX Explain whch API variant current E-on-CL implements.

## Contents |

## Redesign goals

- A stream should be able to wrap a POSIX file descriptor without hiding too much of the functionality. In particular, it should be possible to implement, for example, a console-IO process, which must not read from stdin until it actually wants user input (or be suspended if it is a background process).
- It should be possible to write interesting and lightweight wrappers around stream objects.
- It should be possible to interconvert between sequences and streams readily.
- Furthermore, given that we can convert a sequence to an input stream, streams solve another design issue: parallel iteration. Currently, there is no way to iterate over multiple collections in parallel without explicit indexing (because iterate/1 is internal iteration); whereas a group of stream objects can easily be read in parallel.
- Doing this introduces another requirement: It should be possible to iterate over a map, and obtain the key-value pairs.

- Furthermore, given that we can convert a sequence to an input stream, streams solve another design issue: parallel iteration. Currently, there is no way to iterate over multiple collections in parallel without explicit indexing (because iterate/1 is internal iteration); whereas a group of stream objects can easily be read in parallel.

## Version 1

XXX I don't even remember what this was

## Version 2

XXX Record what EoCL does here

## Version 3a: Dividoids

In order to solve the problem of interacting with arbitrary chunk types, we replace the chunk type guard with an object which provides the interface for working with chunks (now renamed 'segments'). These objects are named dividoids, which name was suggested by a friend with knowledge of category theory.

The Dividoid interface is designed to be highly generic; for example, it could equally be well used to chop up a function representing a real number value varying continuously, as well as collections of discrete objects. This is deliberate, to benefit the fuzzier cases such as text (which you might want to avoid breaking in the middle of, say, a UTF-8 character, or a combining sequence, or a word...)

Current draft dividoid interface:

/** A dividoid is ...XXX finish this * * The Size type should typically support addition, subtraction, and isZero. * * A dividoid should not be stateful. * * This abstraction was designed by Kevin Reid with help from Joe Geldart and ideas from the Haskell Monoid type class and <http://conal.net/blog/posts/sequences-segments-and-signals/>. */ interface "org.erights.e.elib.eio.dividoid"[Segment, Size] { to segmentType() :Same[Segment] to sizeType() :Same[Size] /** The empty, or identity, segment. * * Law: d.join(d.empty(), a) is equivalent to a. * Law: d.join(a, d.empty()) is equivalent to a. */ to empty() :Segment /** The size of a segment. When possible, this operation should only be applied to segments which are the left side of a d.split(); this permits infinite segments. */ to size(e :Segment) :Size /** Join segments to make one segment. * * Law: d.join is associative. * * Law: d.split(d.join(a, b), d.size(a))[0] is equivalent to a. * * Not a law: d.split(d.join(a, b), d.size(a))[1] is equivalent to b. (For example, maps and sets cannot implement this.) */ to join(segments :List[Segment]) :Segment /** Break this segment up into two segments, where the size of the first is the minimum of the given size and the size of the given segment. * * Law: def [a,b] := d.split(x, _); d.join(a, b) is equivalent to x. * * Law: If EIO.getALL() is in the range of Size, then d.split(a, ALL) is equivalent to [a, d.empty()]. */ to split(e :Segment, size :Size) :Tuple[Segment, Segment] XXX split-off-one-element operation for sanity in parallel iteration? }

XXX Expand on this

Rationales:

- The join operation takes a list of segments so that it can use an efficient accumulation procedure and avoid redundant allocations.

## Version 3b (not yet coded)

The problem with the Dividoid notion is that it is unnatural to E to have an external object providing the operations. The motivation for it was to allow this functionality to be added without adding many methods to the individual objects. Suppose we discard that and add operations to the objects.

Disadvantages:

- possible clutter in the already-large collection interfaces and naming conflicts.
- no natural object from which to get the empty segment

Advantages:

- Better promise pipelining behavior
- Possible utility as collection methods
- Generalize cdr-pattern and its map version's semantics?

Draft interface:

<import:org.cubik.cle.makeAdvisoryInterface>( /** The Size type should typically support addition, subtraction, and isZero. * * An earlier version of this abstraction was designed by Kevin Reid with help from * Joe Geldart and ideas from the Haskell Monoid type class and * <http://conal.net/blog/posts/sequences-segments-and-signals/>. */ interface "org.erights.e.elib.eio.segment"[Size] { /** The size of a segment. When possible, this operation should only be applied to * segments which are the left side of a d.split(); this permits infinite segments. */ to size() :Size /** Append the provided segments to this segment to make one segment. * * Law: desegment is associative among [this]+segments. * * Law: a.desegment([b]).split(a.size())[0] is equivalent to a. * * Not a law: d.join(a, b).split(a.size())[1] is equivalent to b. (For example, maps * and sets cannot implement this.) */ to desegment(segments :List[Segment]) :Segment /** Break this segment up into two segments, where the size of the first is the minimum of * the given size and the size of this. * * Law: def [a,b] := x.segment(_); a.desegment([b]) is equivalent to x. * * Law: x.segment(someSize - someSize)[0] = identity, such that identity.desegment([x]) is * equivalent to x. * * Law: If EIO.getALL() is in the range of Size, then a.split(ALL) is equivalent to * [a, a.split(zero)[0]]. */ to segment(size :Size) :Tuple[Segment[Size], Segment[Size]] /** Return a segment of size zero. * * XXX This is redundant with .split(someSize-someSize) and could be removed. * * This is an identity element. If a is a compatible segment, * Law: this.withoutAll().desegment([a]) is equivalent to a. * Law: this.desegment([a.withoutAll()]) is equivalent to this. */ to withoutAll() :Segment[Size] })

Plans:

- If an ordinary finite collection (List, Map, etc) is coerced to a Segment, have it return an object specialized for being an efficient iterator.
- But wait, then segments aren't ordinary collections. Implementation-switching, or coerce back?

Design TODO:

- Figure out how this supports parallel iteration. We need an analogue to segment(1) => [1-element-collection, rest-of-items], produces [key, value, rest-of-items] (the key and value being what iterate/1 gives its AssocFunc). This could be done by splitting out a 1-element segment and getting key and value from that, or by a method that actually yields that 3-tuple.