|
|
4dfee016eb
|
handle edge case in char-range: in a range like a-z, the second character may be an escape sequence, so handle that case too
|
2025-10-07 12:13:41 +01:00 |
|
|
|
44c2fbb3c7
|
handle character ranges like a-z in character class and negated character class
|
2025-10-07 09:48:10 +01:00 |
|
|
|
3d4dbdda69
|
done adding functionality for parsing character classes
|
2025-10-07 09:31:24 +01:00 |
|
|
|
8eed2ef51a
|
add support for [^negated_character_classes], although we don't parse them yet
|
2025-10-07 08:57:01 +01:00 |
|
|
|
d6142285da
|
add handling for [character class] type (but note that we don't parse a character class yet)
|
2025-10-07 08:51:46 +01:00 |
|
|
|
56658a4a70
|
only convert char to int in dfa-gen.sml's 'convertChar' loop
|
2025-10-07 08:43:15 +01:00 |
|
|
|
71786a494c
|
fix minor bug with escape sequences: we should pattern match on an unescaped char, and we should return an escaped char. For example, it makes sense to pattern match on plain unescaped "n" and return "\n". This is because the user inputs an escape char as a two-char sequence that begins with a backslash \ character
|
2025-10-06 21:58:50 +01:00 |
|
|
|
cc5c0bf95c
|
implement escape sequences for regex
|
2025-10-06 21:44:57 +01:00 |
|
|
|
0bcb3e1dfc
|
done implementing '+' and '?' regex operators
|
2025-10-06 20:41:52 +01:00 |
|
|
|
dcd930f855
|
begin expanding ? and + regex symbols, which we can represent using a combination of the others
|
2025-10-06 17:08:57 +01:00 |
|
|
|
9dd44c8eca
|
fix implementation of ZERO_OR_MORE (Kleene star) in dfa-gen.sml
|
2025-10-06 14:45:28 +01:00 |
|
|
|
2779b61c1f
|
amendment to 'lastpos' function: if right child is not nullable, then get lastpos of right child, or else get union of them both
|
2025-10-06 13:50:34 +01:00 |
|
|
|
e05c690548
|
fix bug with implementation of wildcard: we don't want to match a wildcard if the character we are getting follow-positions for has an ASCII code of 0, because we are using that as an endmarker
|
2025-10-06 12:12:23 +01:00 |
|
|
|
cca2602429
|
fix bug in implementation of DFA algorithm: we need to add an end marker, and this will be used to tell us whether we have reached the final state in the DFA
|
2025-10-06 11:49:10 +01:00 |
|
|
|
3f30d49420
|
progress using dfa for searching
|
2025-10-06 09:55:05 +01:00 |
|
|
|
626aa0a860
|
add utility functions for using generated dfa
|
2025-10-06 09:06:04 +01:00 |
|
|
|
f554c0db29
|
change 'dtran' set to only contain integers indicating the index from dstates to transition to on char
|
2025-10-06 08:21:04 +01:00 |
|
|
|
a3287e71b9
|
take care of todo note addressing efficiency: don't update dtran vector on each 'convertChar' loop, but accumulate set and then append set to end of dtran at end of 'convertChar' loop
|
2025-10-06 08:11:30 +01:00 |
|
|
|
6ae38189cf
|
previously, dtran was a {states: int list, transitions: set} record, but because the states are the exact same as the information in dstates (at same position too), we changed dtran to contain only the transitions
|
2025-10-06 07:53:05 +01:00 |
|
|
|
c995d3cdf7
|
if we encounter an empty state when getting follow positions, skip to next char
|
2025-10-06 07:44:46 +01:00 |
|
|
|
303bcdf23d
|
fix type errors
|
2025-10-05 20:27:48 +01:00 |
|
|
|
988ef22e75
|
first pass implementing 'convertChar' function
|
2025-10-05 20:19:26 +01:00 |
|
|
|
ecdf642f13
|
progress with 'get-follow-positions-of-each-char' loop
|
2025-10-05 15:31:11 +01:00 |
|
|
|
01fed05c87
|
remove functions which will soon be dead code, and stub out the code which uses them
|
2025-10-05 14:45:36 +01:00 |
|
|
|
d3795c771a
|
implement a function which descends down to a particular position, and then computes followpos: there were previously two separate functions performing these two tasks
|
2025-10-05 12:04:20 +01:00 |
|
|
|
7e2021be24
|
tiny changes to dfa-gen.sml to make it more presentable when asking for advice
|
2025-10-03 07:29:28 +01:00 |
|
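
The `lastpos` rule described in commit 2779b61c1f (for a concatenation, take the right child's lastpos when the right child is not nullable, otherwise the union of both children's) can be illustrated with a minimal sketch. This is not the code from dfa-gen.sml: it is written in Python rather than Standard ML, and the node types `Leaf`, `Concat`, `Alt`, and `Star` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass


# Hypothetical syntax-tree nodes; the names are illustrative only
# and are not taken from dfa-gen.sml.
@dataclass(frozen=True)
class Leaf:
    char: str
    pos: int  # position number assigned to this leaf


@dataclass(frozen=True)
class Concat:
    left: object
    right: object


@dataclass(frozen=True)
class Alt:
    left: object
    right: object


@dataclass(frozen=True)
class Star:
    child: object


def nullable(node):
    """Return True if the subtree can match the empty string."""
    if isinstance(node, Leaf):
        return False
    if isinstance(node, Star):
        return True
    if isinstance(node, Alt):
        return nullable(node.left) or nullable(node.right)
    if isinstance(node, Concat):
        return nullable(node.left) and nullable(node.right)
    raise TypeError(f"unknown node: {node!r}")


def lastpos(node):
    """Return the set of positions that can match the last character."""
    if isinstance(node, Leaf):
        return frozenset({node.pos})
    if isinstance(node, Star):
        return lastpos(node.child)
    if isinstance(node, Alt):
        return lastpos(node.left) | lastpos(node.right)
    if isinstance(node, Concat):
        # The rule from the commit message: only the right child counts,
        # unless the right child can match the empty string.
        if nullable(node.right):
            return lastpos(node.left) | lastpos(node.right)
        return lastpos(node.right)
    raise TypeError(f"unknown node: {node!r}")


# "ab": the right child is not nullable, so only position 2 is last.
print(sorted(lastpos(Concat(Leaf("a", 1), Leaf("b", 2)))))        # [2]
# "ab*": the right child is nullable, so both positions can be last.
print(sorted(lastpos(Concat(Leaf("a", 1), Star(Leaf("b", 2))))))  # [1, 2]
```

The second call shows why the union is needed: in `ab*` the match may end on `a` when `b*` matches nothing.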