102 Commits

Author SHA1 Message Date
9d46ec9f34 add additional test for 'dn' motion after finding bug, fixed bug (rewrote high-level delete function in persistent-vector.sml to address it), and begin adding tests for 'dN' motion 2026-03-28 00:45:08 +00:00
756f44e7f8 add tests for 'd$' motion, and fix bug. When we are extending an existing match and reached the last index of the buffer, we used to return the searchList right away. However, we are meant to add the extended match before returning the searchList. We do this now, and the bug is fixed. 2026-02-10 09:56:35 +00:00
33866533a3 address remaining todo-notes, which had to do with updating the searchList when we insert into a buffer. 2026-02-08 03:17:19 +00:00
c28ae4d8cd code function that can insert into both searchList and buffer 2026-02-08 02:32:32 +00:00
340e52019f handle edge case when deleting from buffer: if the previous match is extended into a new match, then replace the old match in the search list with the extended match 2026-02-07 02:25:45 +00:00
b02b2f53da when deleting from buffer and search list both, don't try to find any matches if the DFA is empty (has a length of 0), because that means there is no search to find a match for 2026-02-06 21:25:44 +00:00
02086e0922 code outline of a function to extend an existing match in search-list.sml. 2026-02-06 20:30:07 +00:00
df7669b065 progress in changing functions to use 'PersistentVector.delete' so that search list is incremental and not rebuilt from scratch after each deletion 2026-02-06 08:52:11 +00:00
c6dee6e9f9 implement function that deletes from both LineGap.t and SearchList, maintaining an exact match between both 2026-01-18 09:59:00 +00:00
111e0cf66d remove usage of concurrent ml, deciding that we prefer to run everything in the main thread instead 2025-10-17 23:08:16 +01:00
22a8b807d2 handle edge case when building dfa from a string, where an exception was raised when our search regex contains an alternation where the second alternation is a substring of the first alternation, and add a test for it to make sure that it passes 2025-10-14 02:24:45 +01:00
ca2c2f438c when adding to followset in ONE_OR_MORE case, make sure we add the child to the followset as well 2025-10-12 00:32:55 +01:00
ce3470e612 fix bug in regex-test: dfa-gen.sml should add the position of the endMarker to the followSet as well 2025-10-12 00:22:14 +01:00
7f1f1f7bdc at end of char loop, track if length of dstate changed. If it did not, that means that we have encountered a loop that is at the end; thus, we should add the endMarker 2025-10-11 13:39:28 +01:00
b2931753d0 make dfa-gen.sml compile again, with parity before reimplementing it 2025-10-11 13:23:44 +01:00
96f0afc2b2 attempt at fixing dfa-gen to convert properly 2025-10-11 11:32:30 +01:00
a44afca40b checkpoint for reimplementing dfa-gen.sml 2025-10-10 11:54:34 +01:00
5a43954aef checkpoint 2025-10-10 04:59:32 +01:00
244d0ce26d begin attempt to compute followpos properly 2025-10-10 04:44:18 +01:00
bdfca17b5a implement function to insert a list to a pos 2025-10-10 04:00:34 +01:00
58c3e65fdd add list of follows to leaves in regex parse tree (only changed data type; need to populate follows list later) 2025-10-10 03:49:09 +01:00
108a30ea79 add utility function to insert from a list into a set 2025-10-10 03:29:52 +01:00
88eb30dbf2 done caching firstpos and lastpos, and using the cached data 2025-10-10 01:56:54 +01:00
6e646bdffa begin computing firstpos and lastpos during parsing 2025-10-10 01:43:24 +01:00
3197315478 fix failing tests for escaping regex metacharacters 2025-10-09 06:22:21 +01:00
a5fec6f1a2 add tests for escape sequences 2025-10-09 06:06:07 +01:00
250ae239be begin adding tests for regex 2025-10-09 05:34:32 +01:00
0de7a9278a progress implementing help-prev-match for vector 2025-10-08 10:27:19 +01:00
3b823d7ae6 delete 'nextMatch' function in search-list.sml, and refactor other code to use alternative function 2025-10-08 08:16:20 +01:00
108e021fdb log an exception if search-thread encounters a failure 2025-10-08 06:51:52 +01:00
3c2e5812cd reimplement function to search through text from scratch 2025-10-08 06:35:49 +01:00
06106f5de8 remove 'searchString' field from app_type, because the same role is fulfilled by new 'dfa' field 2025-10-08 05:40:29 +01:00
8857f49537 pass DFA to 'SearchList.buildRange' function, so that we don't need to parse search string into DFA each time 2025-10-08 05:20:33 +01:00
7a72bc2ed1 done with allowing different types of endMarkers 2025-10-07 14:44:40 +01:00
f085860f20 begin making changes to return a parse error if regex string contains an end marker 2025-10-07 14:36:35 +01:00
060df2745a fix bugs: only wildcard and character-class-negation should check to see if curChr is an endmarker 2025-10-07 14:30:23 +01:00
c62e234d00 change dfa-gen to a functor, and use functor to instantiate different structures 2025-10-07 14:05:45 +01:00
075fec02be handle edge case in char range: escaped char followed by another escaped char 2025-10-07 12:28:14 +01:00
4dfee016eb handle edge case in char-range: in a range like a-z, the second character may be an escape sequence, and we need to handle that case if so 2025-10-07 12:13:41 +01:00
44c2fbb3c7 handle character ranges like a-z in character class and negated character class 2025-10-07 09:48:10 +01:00
3d4dbdda69 done adding functionality for parsing character classes 2025-10-07 09:31:24 +01:00
8eed2ef51a add support for [^negated_character_classes], although we don't parse them yet 2025-10-07 08:57:01 +01:00
d6142285da add handling for [character class] type (but note that we don't parse a character class yet) 2025-10-07 08:51:46 +01:00
56658a4a70 only convert char to int in dfa-gen.sml's 'convertChar' loop 2025-10-07 08:43:15 +01:00
71786a494c fix minor bug with escape sequences: we should pattern match on an unescaped char, and we should return an escaped char. For example, it makes sense to pattern match on plain unescaped #"n" and return #"\n". This is because user inputs escape-chars as a two-char sequence, prepended by a backslash \ character 2025-10-06 21:58:50 +01:00
cc5c0bf95c implement escape sequences for regex 2025-10-06 21:44:57 +01:00
0bcb3e1dfc done implementing '+' and '?' regex operators 2025-10-06 20:41:52 +01:00
dcd930f855 begin expanding ? and + regex symbols, which we can represent using a combination of the others 2025-10-06 17:08:57 +01:00
9dd44c8eca fix implementation of ZERO_OR_MORE (Kleene star) in dfa-gen.sml 2025-10-06 14:45:28 +01:00
2779b61c1f amendment to 'lastpos' function: if right child is not nullable, then get lastpos of right child, or else get union of them both 2025-10-06 13:50:34 +01:00