add examples of usage

2024-03-24 12:50:57 +00:00
parent c12aaea8c2
commit e2b1d2c58c
6 changed files with 216 additions and 16 deletions

9
.gitignore vendored

@@ -1,3 +1,6 @@
/proj
/proj.du
/proj.ud
/bench
/bench.du
/bench.ud
/examples
/examples.du
/examples.ud


@@ -4,4 +4,32 @@
Standard ML port of [this](https://github.com/hummy123/brolib) rope implementation.
This particular rope uses the balancing scheme described in the [Purely Functional 1-2 Brother Trees paper authored by Ralf Hinze](https://www.cs.ox.ac.uk/ralf.hinze/publications/Brother12.pdf). It tries to keep the number of nodes to a minimum by joining the strings in adjacent leaf nodes, if joining would not be too expensive.
## Usage
The two files are `rope.sml` and `tiny_rope.sml`.
`rope.sml` contains a rope that tracks line metadata (which has a small performance and memory penalty). This is useful if you have line-based operations in mind.
`tiny_rope.sml` doesn't track line metadata, and is useful when line-queries aren't needed.
Except for those line-based operations marked below, all functions are the same between the two.
### Examples
#### Initialise
`val rope = Rope.fromString "hello, world!"`
It's best to use a string with a length less than or equal to 1024 for performance reasons. (The point of a rope is to represent a large string using a binary tree that contains smaller pieces.)
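As a rough illustration of the advice above, a long string could be cut into pieces of at most 1024 characters before it reaches the rope. The sketch below uses only the Basis library. `chunks` is a hypothetical helper that is not part of `rope.sml`, and it assumes `maxLen` is at least 2 so that it can avoid cutting between `\r` and `\n`.

```sml
(* Hypothetical helper, not part of rope.sml: split a string into pieces of
 * at most maxLen characters, never cutting between \r and \n
 * (assuming maxLen >= 2). *)
fun chunks (maxLen, str) =
  let
    val size = String.size str
    fun loop (pos, acc) =
      if pos >= size then
        List.rev acc
      else
        let
          val stop = Int.min (pos + maxLen, size)
          (* Back off by one if the cut would separate a \r\n pair. *)
          val stop =
            if stop < size
               andalso stop - pos > 1
               andalso String.sub (str, stop - 1) = #"\r"
               andalso String.sub (str, stop) = #"\n"
            then stop - 1
            else stop
        in
          loop (stop, String.substring (str, pos, stop - pos) :: acc)
        end
  in
    if maxLen < 1 then raise Domain else loop (0, [])
  end

(* pieces = ["ab", "\r\nc", "d\r\n", "ef"] *)
val pieces = chunks (3, "ab\r\ncd\r\nef")
```

Each piece could then be added with `Rope.append (piece, rope)`, starting from `Rope.empty`.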
#### Convert to string
`val str = Rope.toString rope`
Avoid this function where possible: it may allocate an extremely large string, which is what a rope is meant to prevent.
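To show the idea of querying contents without building one big string, here is a minimal sketch using only the Basis library. The list `pieces` is a stand-in for a rope's leaf strings, and `countNewlines` is a hypothetical name. With the library itself, a fold such as `Rope.foldFromIdx (apply, 0, rope, acc)` would play the same role.

```sml
(* Stand-in for a rope's leaves: a list of small strings. *)
val pieces = ["hello, ", "world!\n", "goodbye\n"]

(* Count newlines by folding over each piece's characters,
 * without ever concatenating the pieces into one string. *)
fun countNewlines pieces =
  List.foldl
    (fn (piece, total) =>
       CharVector.foldl
         (fn (chr, n) => if chr = #"\n" then n + 1 else n)
         total
         piece)
    0
    pieces

(* newlines = 2 *)
val newlines = countNewlines pieces
```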
#### Insert
`val rope = Rope.insert (0, "hello, world!", rope)`
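The comments in `examples.sml` say that an index below 0 prepends the string and an index past the end appends it. The following sketch models that clamping on a plain string, using only the Basis library. `insertModel` is a hypothetical name and is not how `rope.sml` is implemented.

```sml
(* Hypothetical model of the documented index handling, on a plain string. *)
fun insertModel (idx, new, old) =
  let
    val size = String.size old
    (* Clamp the index into the range 0..size. *)
    val idx = Int.max (0, Int.min (idx, size))
  in
    String.substring (old, 0, idx) ^ new ^ String.extract (old, idx, NONE)
  end

val prepended = insertModel (~5, "a", "bc")  (* "abc" *)
val appended = insertModel (99, "c", "ab")   (* "abc" *)
val middle = insertModel (1, "x", "ab")      (* "axb" *)
```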


5
examples.mlb Normal file

@@ -0,0 +1,5 @@
$(SML_LIB)/basis/basis.mlb
tiny_rope.sml
rope.sml
examples.sml

139
examples.sml Normal file

@@ -0,0 +1,139 @@
(* An empty rope, containing no strings. *)
val rope = Rope.empty;
(* Initialise rope from a string.
*
* You probably want to avoid initialising the rope with very long strings,
* because a rope is meant to represent a long string
* by holding nodes that contain smaller strings in a binary tree.
* The implementation avoids building strings that are ever larger than 1024,
* but that was done in an attempt to find the ideal length for performance.
* A user shouldn't notice any delays in larger lengths like 65535 either.
*
* In their text buffer (a piece-tree, which is slower than a rope),
* the VS Code team had other issues with excessively large strings.
* https://code.visualstudio.com/blogs/2018/03/23/text-buffer-reimplementation#_avoid-the-string-concatenation-trap *)
val rope = Rope.fromString "hello, world!\n";
(* Convert a rope to a string.
*
* This may involve allocating an extremely large string in some cases,
* which should be avoided for the reason mentioned in the above comment. *)
val str = Rope.toString rope;
(* Insert a string into the rope.
*
* There isn't any validation to check that you inserted at a reasonable
* position.
* If you insert at an index lower than 0, your inserted string is just
* prepended to the start.
* If you insert at an index greater than the length, your inserted string is
* just appended to the end.
*
* One thing to watch out for if you are using the line-rope is making sure
* that you don't insert in the middle of a \r\n pair, separating \r from \n.
* That would mess up the line metadata the rope contains and make the line
* metadata invalid. *)
val rope = Rope.insert (14, "goodbye, world!", rope);
(* Append a string to the end of the rope. *)
val rope = Rope.append ("hello again\n", rope);
(* Append a string to the end of the rope, providing line metadata with it.
*
* The point of this function is for performance: the other insertion functions
* calculate the line metadata by scanning the string itself, but in some cases
* this is already known. The larger example below is such a case. *)
val rope = Rope.appendLine ("my new line", Vector.fromList [], rope);
(** Second, larger example motivating Rope.appendLine below. *)
(*** Returns the index where a line's trailing line break starts:
*** the index of \r if the line ends with a \r\n pair, otherwise the last index. *)
fun getLineStart line =
  let
    val lastIdx = String.size line - 1
    val lastChr = String.sub (line, lastIdx)
  in
    if lastChr = #"\n" andalso lastIdx - 1 >= 0 then
      if String.sub (line, lastIdx - 1) = #"\r" then lastIdx - 1 else lastIdx
    else
      lastIdx
  end;
(*** Appends the lines in a file to a rope. *)
fun readLines (rope, file) =
  case TextIO.inputLine file of
    SOME line =>
      let
        (* Don't need to scan string to find line breaks,
         * because we already know. *)
        val lineIdx = getLineStart (line)
        val vec = Vector.fromList [lineIdx]
        val rope = Rope.appendLine (line, vec, rope)
      in
        readLines (rope, file)
      end
  | NONE => rope;
val licenseRope = readLines (Rope.empty, TextIO.openIn "LICENSE");
(* Deletes the given range from rope, from the start index to the end index.
*
* As with insert, one should make sure they don't corrupt the line metadata.
* Specifically, in a \r\n pair, the line metadata points to \r.
* Deleting \r would corrupt it, but deleting \n would be fine.
* In general, if you want to delete a line break, you would want to delete both
* \r and \n. The user thinks of the \r\n pair as a single character so they are
* expecting the whole line break to be deleted. *)
(** Initialise new rope. *)
val rope = Rope.fromString "hello, world!";
(** New rope contains "hello world!" without comma. *)
val rope = Rope.delete (5, 1, rope);
(* Folds over the characters in a rope, starting from the given index.
*
* This is meant to be an alternative to queries for a specific line or a
* substring.
* If a rope is meant to avoid allocating large strings, then it seems more
* performant to query its contents through higher-order functions rather than
* allocating substrings and querying the substring. *)
val rope = Rope.fromString "hello!";
fun apply (chr, lst) = chr :: lst;
(** val result = [#"!",#"o",#"l",#"l",#"e"] : char list *)
val result = Rope.foldFromIdx (apply, 1, rope, []);
(* Folds over the characters in a rope, accepting a predicate function
* that terminates the fold when it returns true. *)
fun apply (chr, acc) =
(print (Char.toString chr); acc + 1);
fun term acc = acc = 3;
(** Below function prints first three letters, "hel",
** and then stops folding. *)
val _ = Rope.foldFromIdxTerm (apply, term, 0, rope, 0);
(* Folds over the characters in a rope, starting from the given line number.
*
* This is just like the foldFromIdxTerm function, except that it starts folding
* from the given line number instead. *)
val rope = Rope.fromString "hello, world!\ngoodbye, world!\nhello again!";
fun apply (chr, _) =
print (Char.toString chr);
fun term _ = false;
(** Below line prints the whole string, one character at a time. *)
Rope.foldLines (apply, term, 0, rope, ());
(** Prints starting from #"g" in "goodbye". *)
Rope.foldLines (apply, term, 1, rope, ());
(** Prints the very last line. *)
Rope.foldLines (apply, term, 2, rope, ());
(** Prints the whole string if specifying a line before 0, which doesn't exist. *)
Rope.foldLines (apply, term, ~3, rope, ());
(** Raises a subscript exception: there is no corresponding line in the rope. *)
Rope.foldLines (apply, term, 4, rope, ());


@@ -5,7 +5,6 @@ sig
val fromString: string -> t
val toString: t -> string
val foldr: ('a * string * int vector -> 'a) * 'a * t -> 'a
(* The caller should not insert in the middle of a \r\n pair,
* or else line metadata will become invalid. *)
@@ -727,34 +726,60 @@ struct
 val chr = String.sub (str, pos)
 val acc = apply (chr, acc)
 in
-foldLineCharsTerm (apply, term, pos, str, strSize, acc)
+foldLineCharsTerm (apply, term, pos + 1, str, strSize, acc)
 end
 | true => acc
 else
 acc
-fun foldLines (apply, term, lineNum, rope, acc) =
+fun helpFoldLines (apply, term, lineNum, rope, acc) =
 case rope of
 N2 (l, _, lmv, r) =>
 if lineNum < lmv then
 let
-val acc = foldLines (apply, term, lineNum, rope, acc)
+val acc = helpFoldLines (apply, term, lineNum, rope, acc)
 in
 if term acc then acc
-else foldLines (apply, term, lineNum - lmv, r, acc)
+else helpFoldLines (apply, term, lineNum - lmv, r, acc)
 end
 else
-foldLines (apply, term, lineNum - lmv, r, acc)
-| N1 t => foldLines (apply, term, lineNum, t, acc)
+helpFoldLines (apply, term, lineNum - lmv, r, acc)
+| N1 t => helpFoldLines (apply, term, lineNum, t, acc)
 | N0 (str, vec) =>
-let
-val idx =
-if Vector.length vec > 0 then Vector.sub (vec, lineNum) else 0
-in
-foldLineCharsTerm (apply, term, idx, str, String.size str, acc)
-end
+(* We have a few edge cases to handle here.
+* 1. If lineNum is 0 or the vector has no elements,
+* we should start folding from the start of the string.
+* 2. Since the vector points to the start of a linebreak
+* (which means either \r or \n when either is alone,
+* or \r in a \r\n pair),
+* we have to skip the linebreak or linebreak pair when folding
+* over the string. That is more intuitive to the user. *)
+if lineNum < 0 orelse Vector.length vec = 0 then
+foldLineCharsTerm (apply, term, 0, str, String.size str, acc)
+else
+let
+val idx = Vector.sub (vec, lineNum)
+in
+if idx + 1 < String.size str then
+let
+val chr = String.sub (str, idx)
+val nextChr = String.sub (str, idx + 1)
+in
+if chr = #"\r" andalso nextChr = #"\n" then
+foldLineCharsTerm
+(apply, term, idx + 2, str, String.size str, acc)
+else
+foldLineCharsTerm
+(apply, term, idx + 1, str, String.size str, acc)
+end
+else
+acc
+end
 | _ => raise AuxConstructor
+fun foldLines (apply, term, lineNum, rope, acc) =
+helpFoldLines (apply, term, lineNum - 1, rope, acc)
 fun verifyLines rope =
 foldr
 ( (fn (_, str, vec) =>