Commit Graph

588 Commits

Author SHA1 Message Date
250ae239be begin adding tests for regex 2025-10-09 05:34:32 +01:00
70215fbc0a done implementing functionality to get PersistentVector.prevMatch working. We use the call stack to try the node at the previous index if we receive an invalid state from the recursive call. 2025-10-08 11:10:06 +01:00
088c5c3d98 checkpoint while implementing prevMatch functionality 2025-10-08 10:39:49 +01:00
0de7a9278a progress implementing help-prev-match for vector 2025-10-08 10:27:19 +01:00
3b823d7ae6 delete 'nextMatch' function in search-list.sml, and refactor other code to use alternative function 2025-10-08 08:16:20 +01:00
8941ce9f89 reimplement functionality to search forwards using 'n' command 2025-10-08 08:10:51 +01:00
108e021fdb log an exception if search-thread encounters a failure 2025-10-08 06:51:52 +01:00
3c2e5812cd reimplement function to search through text from scratch 2025-10-08 06:35:49 +01:00
5c8e74ac11 change type of SEARCH message to take a DFA, instead of a searchString 2025-10-08 05:54:19 +01:00
06106f5de8 remove 'searchString' field from app_type, because the same role is fulfilled by new 'dfa' field 2025-10-08 05:40:29 +01:00
df346d0a9e add ability to switch to case-sensitive-search-mode using '?' command from normal mode 2025-10-08 05:29:05 +01:00
8857f49537 pass DFA to 'SearchList.buildRange' function, so that we don't need to parse search string into DFA each time 2025-10-08 05:20:33 +01:00
fd8385fa81 add dfa field to app_type so that we don't rebuild DFA each time we want to execute a search again (like after deleting) 2025-10-08 05:02:15 +01:00
7f68084398 add 'caseSensitive' field to NORMAL_SEARCH_MODE, so that we know what kind of DFA to build 2025-10-08 04:53:04 +01:00
7a72bc2ed1 done with allowing different types of endMarkers 2025-10-07 14:44:40 +01:00
f085860f20 begin making changes to return a parse error if regex string contains an end marker 2025-10-07 14:36:35 +01:00
060df2745a fix bugs: only wildcard and character-class-negation should check to see if curChr is an endmarker 2025-10-07 14:30:23 +01:00
c62e234d00 change dfa-gen to a functor, and use functor to instantiate different structures 2025-10-07 14:05:45 +01:00
075fec02be handle edge case in char range: escaped char followed by another escaped char 2025-10-07 12:28:14 +01:00
4dfee016eb handle edge case in char-range: in a range like a-z, the second character may be an escape sequence, and we need to handle that case if so 2025-10-07 12:13:41 +01:00
44c2fbb3c7 handle character ranges like a-z in character class and negated character class 2025-10-07 09:48:10 +01:00
3d4dbdda69 done adding functionality for parsing character classes 2025-10-07 09:31:24 +01:00
8eed2ef51a add support for [^negated_character_classes], although we don't parse them yet 2025-10-07 08:57:01 +01:00
d6142285da add handling for [character class] type (but note that we don't parse a character class yet) 2025-10-07 08:51:46 +01:00
56658a4a70 only convert char to int in dfa-gen.sml's 'convertChar' loop 2025-10-07 08:43:15 +01:00
71786a494c fix minor bug with escape sequences: we should pattern match on an unescaped char, and we should return an escaped char. For example, it makes sense to pattern match on plain unescaped #"n" and return #"\n". This is because user inputs escape-chars as a two-char sequence, prepended by a backslash \ character 2025-10-06 21:58:50 +01:00
cc5c0bf95c implement escape sequences for regex 2025-10-06 21:44:57 +01:00
0bcb3e1dfc done implementing '+' and '?' regex operators 2025-10-06 20:41:52 +01:00
dcd930f855 begin expanding ? and + regex symbols, which we can represent using a combination of the others 2025-10-06 17:08:57 +01:00
9dd44c8eca fix implementation of ZERO_OR_MORE (Kleene star) in dfa-gen.sml 2025-10-06 14:45:28 +01:00
2779b61c1f amendment to 'lastpos' function: if right child is not nullable, then get lastpos of right child, or else get union of them both 2025-10-06 13:50:34 +01:00
e05c690548 fix bug with implementation of wildcard: we don't want to match a wildcard if the character we are getting follow-positions for has an ASCII code of 0, because we are using that as an endmarker 2025-10-06 12:12:23 +01:00
ea01f1689c fix bug in search-list.sml: when we find a match, we should start 1 idx after the end position of the match 2025-10-06 11:58:03 +01:00
cca2602429 fix bug in implementation of DFA algorithm: we need to add an end marker, and this will be used to tell us whether we have reached the final state in the DFA 2025-10-06 11:49:10 +01:00
3f30d49420 progress using dfa for searching 2025-10-06 09:55:05 +01:00
626aa0a860 add utility functions for using generated dfa 2025-10-06 09:06:04 +01:00
f554c0db29 change 'dtran' set to only contain integers indicating the index from dstates to transition to on char 2025-10-06 08:21:04 +01:00
a3287e71b9 take care of todo note addressing efficiency: don't update dtran vector on each 'convertChar' loop, but accumulate set and then append set to end of dtran at end of 'convertChar' loop 2025-10-06 08:11:30 +01:00
6ae38189cf previously, dtran was a {states: int list, transitions: set} record, but because the states are the exact same as the information in dstates (at same position too), we changed dtran to contain only the transitions 2025-10-06 07:53:05 +01:00
c995d3cdf7 if we encounter an empty state when getting follow positions, skip to next char 2025-10-06 07:44:46 +01:00
303bcdf23d fix type errors 2025-10-05 20:27:48 +01:00
988ef22e75 first pass implementing 'convertChar' function 2025-10-05 20:19:26 +01:00
ecdf642f13 progress with 'get-follow-positions-of-each-char' loop 2025-10-05 15:31:11 +01:00
01fed05c87 remove functions which will soon be dead code, and cause code which uses them to be stubbed out 2025-10-05 14:45:36 +01:00
d3795c771a implement a function which descends down to a particular position, and then computes followpos: there were previously two separate functions performing these two tasks 2025-10-05 12:04:20 +01:00
7e2021be24 tiny changes to dfa-gen.sml to make it more presentable when asking for advice 2025-10-03 07:29:28 +01:00
0696d7ed52 make 'dstates' in 'Nfa.ToDfa.convert' function a vector, rather than a list, and make sure we append to the end each time we add 2025-10-03 05:54:48 +01:00
2de40a09c7 add code to get all transitions in DFA 2025-10-03 05:17:13 +01:00
ff80db1176 progress implementing conversion of regex to a DFA 2025-10-02 13:54:59 +01:00
1c107b0d72 add function to get path to a particular position, for the sake of finding followpos of a particular node 2025-10-02 05:00:23 +01:00