add examples of usage
9
.gitignore
vendored
@@ -1,3 +1,6 @@
|
|||||||
/proj
|
/bench
|
||||||
/proj.du
|
/bench.du
|
||||||
/proj.ud
|
/bench.ud
|
||||||
|
/examples
|
||||||
|
/examples.du
|
||||||
|
/examples.ud
|
||||||
|
|||||||
28
README.md
@@ -4,4 +4,32 @@
|
|||||||
|
|
||||||
Standard ML port of [this](https://github.com/hummy123/brolib) rope implementation.
|
Standard ML port of [this](https://github.com/hummy123/brolib) rope implementation.
|
||||||
|
|
||||||
|
This particular rope uses the balancing scheme described in the [Purely Functional 1-2 Brother Trees paper authored by Ralph Hinze](https://www.cs.ox.ac.uk/ralf.hinze/publications/Brother12.pdf). It tries to keep the number of nodes to a minimum by joining the strings in adjacent leaf nodes, if joining would not be too expensive.
|
||||||
|
|
||||||
## Usage
|
## Usage
|
||||||
|
|
||||||
|
The two files are `rope.sml` and `tiny_rope.sml`.
|
||||||
|
|
||||||
|
`rope.sml` contains a rope that tracks line metadata (which has a small performance and memory penalty). This is useful if you have line-based operations in mind.
|
||||||
|
|
||||||
|
`tiny_rope.sml` doesn't track line metadata, and is useful when line-queries aren't needed.
|
||||||
|
|
||||||
|
Except for those line-based operations marked below, all functions are the same between the two.
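To make "line metadata" concrete: the comments in `rope.sml` describe a vector of indices that point to the start of each line break (`\r` or `\n` when either is alone, or the `\r` of a `\r\n` pair). The sketch below computes such a vector for a plain string. `lineBreaks` is a hypothetical helper written for illustration only; it is not exported by the library, and the library's own scan may differ.

```sml
(* Hypothetical helper, not part of Rope: collect the index of every
 * line break in a string, recording \r for a \r\n pair. *)
fun lineBreaks str =
  let
    val size = String.size str
    fun loop (pos, acc) =
      if pos >= size then
        Vector.fromList (List.rev acc)
      else
        case String.sub (str, pos) of
          #"\n" => loop (pos + 1, pos :: acc)
        | #"\r" =>
            (* Skip the \n of a \r\n pair so it is not counted twice. *)
            if pos + 1 < size andalso String.sub (str, pos + 1) = #"\n" then
              loop (pos + 2, pos :: acc)
            else
              loop (pos + 1, pos :: acc)
        | _ => loop (pos + 1, acc)
  in
    loop (0, [])
  end;

(* Line breaks at index 2 (the \r of \r\n) and index 6 (the lone \n). *)
val breaks = lineBreaks "ab\r\ncd\nef";
```

A vector of this shape is what `Rope.appendLine` appears to accept as its second argument in `examples.sml`.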
|
||||||
|
|
||||||
|
### Examples
|
||||||
|
|
||||||
|
#### Initialise
|
||||||
|
|
||||||
|
`val rope = Rope.fromString "hello, world!"`
|
||||||
|
|
||||||
|
It's best to use a string with a length less than or equal to 1024 for performance reasons. (The point of a rope is to represent a large string using a binary tree that contains smaller pieces.)
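If the input is longer than that, one option is to split it into pieces first and add the pieces one at a time. The sketch below only does the splitting; `chunks` is a hypothetical helper, not part of the library, and the piece size is passed in rather than fixed at 1024.

```sml
(* Hypothetical helper, not part of Rope: split a string into
 * consecutive pieces of at most maxLen characters. *)
fun chunks (str, maxLen) =
  if maxLen <= 0 then
    raise Domain
  else
    let
      val size = String.size str
      fun loop (pos, acc) =
        if pos >= size then
          List.rev acc
        else
          let
            val len = Int.min (maxLen, size - pos)
          in
            loop (pos + len, String.substring (str, pos, len) :: acc)
          end
    in
      loop (0, [])
    end;

(* ["hello", ", wor", "ld!"] *)
val pieces = chunks ("hello, world!", 5);
```

Assuming `Rope.append` takes `(string, rope)` as it does in `examples.sml`, the pieces could then be added in order with something like `List.foldl Rope.append Rope.empty pieces`.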
|
||||||
|
|
||||||
|
#### Convert to string
|
||||||
|
|
||||||
|
`val str = Rope.toString rope`
|
||||||
|
|
||||||
|
Avoid this function where you can: converting a large rope may allocate an extremely large string.
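One way to query contents without building the full string, in the spirit of the fold functions shown in `examples.sml`, is to fold over the pieces directly. The sketch below models this on a plain list of strings standing in for rope leaves. `foldPieces` is a hypothetical name, not the library's API; the library's own folds (such as `Rope.foldFromIdx`) take different arguments.

```sml
(* Hypothetical model, not part of Rope: fold over every character of
 * every piece, left to right, without concatenating the pieces. *)
fun foldPieces (apply, pieces, acc) =
  List.foldl (fn (piece, acc) => CharVector.foldl apply acc piece) acc pieces;

(* Count the commas across all pieces; the result is 1. *)
val commas =
  foldPieces
    ( fn (chr, n) => if chr = #"," then n + 1 else n
    , ["hello", ", wor", "ld!"]
    , 0
    );
```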
|
||||||
|
|
||||||
|
#### Insert
|
||||||
|
|
||||||
|
`val rope = Rope.insert (0, "hello, world!", rope)`
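The comments in `examples.sml` state that the index is not validated: an index below 0 prepends the string, and an index past the end appends it. The sketch below models that behaviour on a plain string so the effect is easy to see. `insertClamped` is a hypothetical function for illustration, not how the rope is implemented.

```sml
(* Hypothetical model, not part of Rope: insert ins into str at idx,
 * clamping idx into the range 0 .. String.size str. *)
fun insertClamped (idx, ins, str) =
  let
    val size = String.size str
    val pos = Int.max (0, Int.min (idx, size))
  in
    String.substring (str, 0, pos) ^ ins ^ String.extract (str, pos, NONE)
  end;

val prepended = insertClamped (~5, "A", "bc"); (* "Abc" *)
val appended = insertClamped (99, "A", "bc");  (* "bcA" *)
val middle = insertClamped (1, "A", "bc");     (* "bAc" *)
```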
|
||||||
|
|||||||
5
examples.mlb
Normal file
@@ -0,0 +1,5 @@
|
|||||||
|
$(SML_LIB)/basis/basis.mlb
|
||||||
|
|
||||||
|
tiny_rope.sml
|
||||||
|
rope.sml
|
||||||
|
examples.sml
|
||||||
139
examples.sml
Normal file
@@ -0,0 +1,139 @@
|
|||||||
|
(* An empty rope, containing no strings. *)
|
||||||
|
val rope = Rope.empty;
|
||||||
|
|
||||||
|
(* Initialise rope from a string.
|
||||||
|
*
|
||||||
|
* You probably want to avoid initialising the rope with very long strings,
|
||||||
|
* because a rope is meant to represent a long string
|
||||||
|
* by holding nodes that contain smaller strings in a binary tree.
|
||||||
|
* The implementation avoids building strings that are ever larger than 1024,
|
||||||
|
* but that was done in an attempt to find the ideal length for performance.
|
||||||
|
* A user shouldn't notice any delays in larger lengths like 65535 either.
|
||||||
|
*
|
||||||
|
* In their text buffer (a piece-tree, which is slower than a rope),
|
||||||
|
* the VS Code team had other issues with excessively large strings.
|
||||||
|
* https://code.visualstudio.com/blogs/2018/03/23/text-buffer-reimplementation#_avoid-the-string-concatenation-trap *)
|
||||||
|
val rope = Rope.fromString "hello, world!\n";
|
||||||
|
|
||||||
|
(* Convert a rope to a string.
|
||||||
|
*
|
||||||
|
* This may involve allocating an extremely large string in some cases,
|
||||||
|
* which should be avoided for the reason mentioned in the above comment. *)
|
||||||
|
val str = Rope.toString rope;
|
||||||
|
|
||||||
|
(* Insert a string into the rope.
|
||||||
|
*
|
||||||
|
* There isn't any validation to check that you inserted at a reasonable
|
||||||
|
* position.
|
||||||
|
* If you insert at an index lower than 0, your inserted string is just
|
||||||
|
* prepended to the start.
|
||||||
|
* If you insert at an index greater than the length, your inserted string is
|
||||||
|
* just appended to the end.
|
||||||
|
*
|
||||||
|
* One thing to watch out for if you are using the line-rope is making sure
|
||||||
|
* that you don't insert in the middle of a \r\n pair, separating \r from \n.
|
||||||
|
* That would mess up the line metadata the rope contains and make the line
|
||||||
|
* metadata invalid. *)
|
||||||
|
val rope = Rope.insert (14, "goodbye, world!", rope);
|
||||||
|
|
||||||
|
(* Append a string to the end of the rope. *)
|
||||||
|
val rope = Rope.append ("hello again\n", rope);
|
||||||
|
|
||||||
|
(* Append a string into the rope, providing line metadata with it.
|
||||||
|
*
|
||||||
|
* The point of this function is for performance: the other insertion functions
|
||||||
|
* calculate the line metadata by scanning the string itself, but in some cases
|
||||||
|
* this is already known. The larger example below is such a case. *)
|
||||||
|
val rope = Rope.appendLine ("my new line", Vector.fromList [], rope);
|
||||||
|
|
||||||
|
(** Second, larger example motivating Rope.appendLine below. *)
|
||||||
|
(*** Returns the index of the line break that ends a line,
|
||||||
|
*** returning the index of \r if the line ends with a \r\n pair. *)
|
||||||
|
fun getLineStart line =
|
||||||
|
let
|
||||||
|
val lastIdx = String.size line - 1
|
||||||
|
val lastChr = String.sub (line, lastIdx)
|
||||||
|
in
|
||||||
|
if lastChr = #"\n" andalso lastIdx - 1 >= 0 then
|
||||||
|
if String.sub (line, lastIdx - 1) = #"\r" then lastIdx - 1 else lastIdx
|
||||||
|
else
|
||||||
|
lastIdx
|
||||||
|
end;
|
||||||
|
|
||||||
|
(*** Appends the lines in a file to a rope. *)
|
||||||
|
fun readLines (rope, file) =
|
||||||
|
case TextIO.inputLine file of
|
||||||
|
SOME line =>
|
||||||
|
let
|
||||||
|
(* Don't need to scan string to find line breaks,
|
||||||
|
* because we already know. *)
|
||||||
|
val lineIdx = getLineStart (line)
|
||||||
|
val vec = Vector.fromList [lineIdx]
|
||||||
|
val rope = Rope.appendLine (line, vec, rope)
|
||||||
|
in
|
||||||
|
readLines (rope, file)
|
||||||
|
end
|
||||||
|
| NONE => rope;
|
||||||
|
|
||||||
|
val licenseRope = readLines (Rope.empty, TextIO.openIn "LICENSE");
|
||||||
|
|
||||||
|
(* Deletes the given range from rope, from the start index to the end index.
|
||||||
|
*
|
||||||
|
* As with insert, one should make sure they don't corrupt the line metadata.
|
||||||
|
* Specifically, in a \r\n pair, the line metadata points to \r.
|
||||||
|
* Deleting \r would corrupt it, but deleting \n would be fine.
|
||||||
|
* In general, if you want to delete a line break, you would want to delete both
|
||||||
|
* \r and \n. The user thinks of the \r\n pair as a single character so they are
|
||||||
|
* expecting the whole line break to be deleted. *)
|
||||||
|
|
||||||
|
(** Initialise new rope. *)
|
||||||
|
val rope = Rope.fromString "hello, world!";
|
||||||
|
(** New rope contains "hello world!" without comma. *)
|
||||||
|
val rope = Rope.delete (5, 1, rope);
|
||||||
|
|
||||||
|
(* Folds over the characters in a rope, starting from the given index.
|
||||||
|
*
|
||||||
|
* This is meant to be an alternative to queries for a specific line or a
|
||||||
|
* substring.
|
||||||
|
* If a rope is meant to avoid allocating large strings, then it seems more
|
||||||
|
* performant to query its contents through higher-order functions rather than
|
||||||
|
* allocating substrings and querying the substring. *)
|
||||||
|
val rope = Rope.fromString "hello!";
|
||||||
|
|
||||||
|
fun apply (chr, lst) = chr :: lst;
|
||||||
|
(** val result = [#"!",#"o",#"l",#"l",#"e"] : char list *)
|
||||||
|
val result = Rope.foldFromIdx (apply, 1, rope, []);
|
||||||
|
|
||||||
|
(* Folds over the characters in a rope, accepting a predicate function
|
||||||
|
* that terminates the fold when it returns true. *)
|
||||||
|
fun apply (chr, acc) =
|
||||||
|
(print (Char.toString chr); acc + 1);
|
||||||
|
|
||||||
|
fun term acc = acc = 3;
|
||||||
|
|
||||||
|
(** Below function prints first three letters, "hel",
|
||||||
|
** and then stops folding. *)
|
||||||
|
val _ = Rope.foldFromIdxTerm (apply, term, 0, rope, 0);
|
||||||
|
|
||||||
|
(* Folds over the characters in a rope, starting from the given line number.
|
||||||
|
*
|
||||||
|
* This is just like the foldFromIdxTerm function, except that it starts folding
|
||||||
|
* from the given line number instead. *)
|
||||||
|
val rope = Rope.fromString "hello, world!\ngoodbye, world!\nhello again!";
|
||||||
|
|
||||||
|
fun apply (chr, _) =
|
||||||
|
print (Char.toString chr);
|
||||||
|
|
||||||
|
fun term _ = false;
|
||||||
|
|
||||||
|
(** Below line prints the whole string, one character at a time. *)
|
||||||
|
Rope.foldLines (apply, term, 0, rope, ());
|
||||||
|
(** Prints starting from #"g" in "goodbye". *)
|
||||||
|
Rope.foldLines (apply, term, 1, rope, ());
|
||||||
|
(** Prints the very last line. *)
|
||||||
|
Rope.foldLines (apply, term, 2, rope, ());
|
||||||
|
|
||||||
|
(** Prints the whole string if specifying a line before 0, which doesn't exist. *)
|
||||||
|
Rope.foldLines (apply, term, ~3, rope, ());
|
||||||
|
(** Raises a subscript exception: there is no corresponding line in the rope. *)
|
||||||
|
Rope.foldLines (apply, term, 4, rope, ());
|
||||||
45
rope.sml
@@ -5,7 +5,6 @@ sig
|
|||||||
|
|
||||||
val fromString: string -> t
|
val fromString: string -> t
|
||||||
val toString: t -> string
|
val toString: t -> string
|
||||||
val foldr: ('a * string * int vector -> 'a) * 'a * t -> 'a
|
|
||||||
|
|
||||||
(* The caller should not insert in the middle of a \r\n pair,
|
(* The caller should not insert in the middle of a \r\n pair,
|
||||||
* or else line metadata will become invalid. *)
|
* or else line metadata will become invalid. *)
|
||||||
@@ -727,34 +726,60 @@ struct
|
|||||||
val chr = String.sub (str, pos)
|
val chr = String.sub (str, pos)
|
||||||
val acc = apply (chr, acc)
|
val acc = apply (chr, acc)
|
||||||
in
|
in
|
||||||
foldLineCharsTerm (apply, term, pos, str, strSize, acc)
|
foldLineCharsTerm (apply, term, pos + 1, str, strSize, acc)
|
||||||
end
|
end
|
||||||
| true => acc
|
| true => acc
|
||||||
else
|
else
|
||||||
acc
|
acc
|
||||||
|
|
||||||
fun foldLines (apply, term, lineNum, rope, acc) =
|
fun helpFoldLines (apply, term, lineNum, rope, acc) =
|
||||||
case rope of
|
case rope of
|
||||||
N2 (l, _, lmv, r) =>
|
N2 (l, _, lmv, r) =>
|
||||||
if lineNum < lmv then
|
if lineNum < lmv then
|
||||||
let
|
let
|
||||||
val acc = foldLines (apply, term, lineNum, rope, acc)
|
val acc = helpFoldLines (apply, term, lineNum, rope, acc)
|
||||||
in
|
in
|
||||||
if term acc then acc
|
if term acc then acc
|
||||||
else foldLines (apply, term, lineNum - lmv, r, acc)
|
else helpFoldLines (apply, term, lineNum - lmv, r, acc)
|
||||||
end
|
end
|
||||||
else
|
else
|
||||||
foldLines (apply, term, lineNum - lmv, r, acc)
|
helpFoldLines (apply, term, lineNum - lmv, r, acc)
|
||||||
| N1 t => foldLines (apply, term, lineNum, t, acc)
|
| N1 t => helpFoldLines (apply, term, lineNum, t, acc)
|
||||||
| N0 (str, vec) =>
|
| N0 (str, vec) =>
|
||||||
|
(* We have a few edge cases to handle here.
|
||||||
|
* 1. If lineNum is 0 or the vector has no elements,
|
||||||
|
* we should start folding from the start of the string.
|
||||||
|
* 2. Since the vector points to the start of a linebreak
|
||||||
|
* (which means either \r or \n when either is alone,
|
||||||
|
* or \r in a \r\n pair),
|
||||||
|
* we have to skip the linebreak or linebreak pair when folding
|
||||||
|
* over the string. That is more intuitive to the user. *)
|
||||||
|
if lineNum < 0 orelse Vector.length vec = 0 then
|
||||||
|
foldLineCharsTerm (apply, term, 0, str, String.size str, acc)
|
||||||
|
else
|
||||||
let
|
let
|
||||||
val idx =
|
val idx = Vector.sub (vec, lineNum)
|
||||||
if Vector.length vec > 0 then Vector.sub (vec, lineNum) else 0
|
|
||||||
in
|
in
|
||||||
foldLineCharsTerm (apply, term, idx, str, String.size str, acc)
|
if idx + 1 < String.size str then
|
||||||
|
let
|
||||||
|
val chr = String.sub (str, idx)
|
||||||
|
val nextChr = String.sub (str, idx + 1)
|
||||||
|
in
|
||||||
|
if chr = #"\r" andalso nextChr = #"\n" then
|
||||||
|
foldLineCharsTerm
|
||||||
|
(apply, term, idx + 2, str, String.size str, acc)
|
||||||
|
else
|
||||||
|
foldLineCharsTerm
|
||||||
|
(apply, term, idx + 1, str, String.size str, acc)
|
||||||
|
end
|
||||||
|
else
|
||||||
|
acc
|
||||||
end
|
end
|
||||||
| _ => raise AuxConstructor
|
| _ => raise AuxConstructor
|
||||||
|
|
||||||
|
fun foldLines (apply, term, lineNum, rope, acc) =
|
||||||
|
helpFoldLines (apply, term, lineNum - 1, rope, acc)
|
||||||
|
|
||||||
fun verifyLines rope =
|
fun verifyLines rope =
|
||||||
foldr
|
foldr
|
||||||
( (fn (_, str, vec) =>
|
( (fn (_, str, vec) =>
|
||||||
|
|||||||