non_greedy metacharacter "?"



primo
08-12-2016, 21:25
It seems the non-greedy "?" works with VBRegExp but not with RegExpr$.
In the pattern c.*?a.*?t we want the character c, followed by as few characters as possible up to the first a, then again as few as possible up to the first t. Without the ?, each .* is greedy and the pattern returns the longest possible match.
Example:
qwccuuiaopttttuikclkjaaaooptoiuecatpp
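
To make the greedy/non-greedy difference concrete, here is a minimal sketch in Python rather than thinBasic. Python's re module is used only because it follows the same lazy-quantifier rule as VBRegExp; it is an illustration of the expected behaviour, not code from the thinBasic samples.

```python
import re

text = "qwccuuiaopttttuikclkjaaaooptoiuecatpp"

# Greedy: each .* takes as much as it can, so one long match results.
greedy = re.findall(r"c.*a.*t", text)

# Non-greedy: each .*? stops at the first following a, then the first t.
lazy = re.findall(r"c.*?a.*?t", text)

print(greedy)
print(lazy)
```

On this string the greedy pattern should give a single 33-character match running from the first c to the last t, while the non-greedy pattern should give three: ccuuiaopt, clkjaaaoopt and cat.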

Using VBRegExp_Test_MatchesAndCollections.tbasic with the above string and the pattern c.*?a.*?t gives the correct shortest matches (they were highlighted in red in the string above in the original post):

'---The following code illustrates how to obtain a SubMatches collection from a regular
'---expression search and how to access its individual members.

uses "VBREGEXP"

dim lpRegExp as dword
dim lpMatches as dword
dim lpMatch as dword
dim strValue as string


'---Allocate a new regular expression instance
lpRegExp = VBREGEXP_New

'---Check if it was possible to allocate and if not stop the script
if isfalse lpRegExp then
  MSGBOX 0, "Unable to create an instance of the RegExp object." & $crlf & "Script terminated"
  stop
end if

'---Set pattern
VBRegExp_SetPattern lpRegExp, "c.*?a.*?t"
'---Set case insensitivity
VBREGEXP_SetIgnoreCase lpRegExp, -1
'---Set global applicability
VBREGEXP_SetGlobal lpRegExp, -1

'---Execute search
lpMatches = VBREGEXP_Execute(lpRegExp, "qwccuuiaopttttuikclkjaaaooptoiuecatpp")
IF ISFALSE lpMatches THEN
  MSGBOX 0, "1. No match found"
else

  dim nCount as long value VBMatchCollection_GetCount(lpMatches)
  IF nCount = 0 THEN
    MSGBOX 0, "2. No match found"
  else

    '---Iterate the Matches collection
    dim I as long

    strValue += "Total matches found: " & nCount & $CRLF & string$(50, "-") & $crlf
    FOR i = 1 TO nCount

      lpMatch = VBMatchCollection_GetItem(lpMatches, i)

      IF ISFALSE lpMatch THEN EXIT FOR

      strValue += "Match number " & i & " found at position: " & VBMatch_GetFirstIndex(lpMatch) & " length: " & VBMatch_Getlength(lpMatch) & $CRLF
      strValue += "Value is: " & VBMatch_GetValue(lpMatch) & $CRLF
      strValue += "--------------" & $CRLF

      VBREGEXP_Release lpMatch

    NEXT

    MSGBOX 0, strValue

  END IF

END IF

IF istrue lpMatches THEN VBREGEXP_Release(lpMatches)
IF istrue lpRegExp THEN VBREGEXP_Release(lpRegExp)


But using the same pattern with RegExpr$ does not work at all, and the same is true in PowerBASIC.
Below I test the pattern c.*\sa with the same string as above, just to show that the code works. Replace that pattern with c.*?a and it no longer works.
Note that c.*\sa.*\st does not work either, which is why I used only two characters (c and a).
The lesson is that VBREGEXP is more correct, even though I wish RegExpr$ worked, since it is smaller.
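
One possible workaround when an engine lacks the non-greedy "?" is to replace each .*? with a negated character class, so that a greedy quantifier cannot run past the first a or t. The sketch below checks the idea in Python on the same string. Whether RegExpr$ accepts the [^...] class syntax is an assumption that has not been tested in thinBasic, and the two patterns are not equivalent for every input, only for cases like this one.

```python
import re

text = "qwccuuiaopttttuikclkjaaaooptoiuecatpp"

# Reference result from the non-greedy pattern.
lazy = re.findall(r"c.*?a.*?t", text)

# [^a]* cannot pass the first a and [^t]* cannot pass the first t,
# so the greedy quantifiers stop where the non-greedy ones would.
negated = re.findall(r"c[^a]*a[^t]*t", text)

print(negated)
print(negated == lazy)
```

If RegExpr$ does support negated classes, the pattern c[^a]*a[^t]*t could be tried in the loop in the code below in place of c.*?a.*?t.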

Uses "UI"
Begin Const
%bRun
%lText

End Const
Dim hDlg As DWord
Function TBMain()
  Dialog New Pixels,0, "regular expressions tests", 1, 1, 700, 500, _
    %WS_DLGFRAME Or %DS_CENTER Or %WS_CAPTION Or %WS_SYSMENU Or %WS_OVERLAPPEDWINDOW, _
    0 To hDlg
  '---Multi-line, scrollable textbox that receives the results
  Control Add Textbox , hDlg, %lText, "click on the button", 5, 30, 600, 400, %WS_TABSTOP Or _
    %ES_WANTRETURN Or _
    %ES_MULTILINE Or _
    %ES_AUTOHSCROLL Or _
    %ES_AUTOVSCROLL Or _
    %WS_VSCROLL Or _
    %WS_HSCROLL

  Control Add Button , hdlg, %bRun,"Run",650,60,40,40,%WS_BORDER Or %WS_TABSTOP
  Dialog Show Modal hDlg, Call dlgProc
End Function

CallBack Function dlgProc() As Long
  Select Case CBMSG

    Case %WM_COMMAND
      Select Case CBCTL
        Case %bRun
          discover() ' call the discover function
      End Select
    Case %WM_CLOSE

  End Select

End Function

Function discover()
  Dim nLines,total,i, nStart As Long
  Dim mytext, result As String

  Dim inputText As String
  inputText = "qwccuuiaopttttuikclkjaaaooptoiuecatpp"
  nStart = 1
  Control Set Text hDlg, %lText, "" 'erase the TextBox
  Local position, length As Long
  Do
    RegExpr$("c.*\sa", inputText, nStart+length, position, length)
    'RegExpr$("c.*?a", inputText, nStart+length, position, length)
    result = Mid$(inputText, position, length)
    Control Append Text hDlg, %lText, result & " " & $CRLF
    nStart = position
  Loop While position

  Control Append Text hDlg, %lText, inputText & " " & $CRLF
End Function