Results 1 to 8 of 8

Thread: Tail recursion optimization?

Hybrid View

Previous Post Previous Post   Next Post Next Post
  1. #1

    Question Tail recursion optimization?

    I have a tail recursive piece of code that could potentially call itself recursively up to thousands of times, and overall be executed up to millions of times each time the program is run. This raises some concerns regarding performance, and the size of the stack available to thinBasic.
    That's why I have these questions:
    • Does thinBasic optimize/eliminate tail recursive calls?
    • If the answer to the previous is no, do I run the risk of overflowing the stack, if the function I'm talking about takes one potentially 240-character long string, one 6-character long string, and one dword as arguments?
    • Is there any profiling information available regarding the FILE_Exists function from the FILE module? My recursive function involves it, so it could be called millions of times as well. I would like to have even a rough estimate as to how long it takes.



    This is the code:
    uses "FILE"
    $myseparator = "_-" 
    
    Function uniquefilename(unsafe As String) As String
    If Not FILE_Exists(unsafe) Then Return unsafe
    Return sequencefilename(getpathandfilename(unsafe), getdotext(unsafe),0)
    End Function
    
    Function sequencefilename(filename As String, dotext As String,_
                              seqnum As DWord) As String
    If Not FILE_Exists(filename & $myseparator & tob32(seqnum) & 
                       dotext) Then _
          Return filename & $myseparator & tob32(seqnum) & dotext                        
    Return sequencefilename(filename,dotext,seqnum+1)
    End Function
    
    tob32 is a function that encodes a number as a base 32 string. As for the other functions, getpathandfilename and getdotext, well, you can probably imagine what they do.

  2. #2
    thinBasic author ErosOlmi's Avatar
    Join Date
    Sep 2004
    Location
    Milan - Italy
    Age
    57
    Posts
    8,781
    Rep Power
    10
    Ciao,

    recursion is always a possible source of problems and will for sure slow down execution due to the necessity to allocate/deallocate local functions stack in a controlled way.
    thinBasic does not creates real compiled machine code applications so the way it handle local stack is completely different from compiled applications.
    In thinBasic stacks are an array of pointers (one for each stacked call to a function) to hash tables created/destroyed on the fly in heap memory, the same memory used by any other allocated variable.
    Every function call has its own hash table used to keep track of function local variables and parameters.

    That said, you need to make your tests in a test controlled environment and see what happen.
    Factors that can influence thinBasic capability to resists to many recursive calls and speed execution are many.
    On speed execution we can work to try to optimize it in some way (for example developing external compiled DLL to do part of the job)

    To your questions:
    1. no optimization takes place on recursive functions
    2. that risk is always there. Influenced factors are so many that is difficult to me to reply. That's why I said you need to try
    3. I'm not aware File_Exists has some performance issue but if you suspect or have evidences that it can be optimized, I can work on it.
      Internally it is developed like the following (Power Basic code):
        FUNCTION File_Exists(BYVAL FullFileName AS STRING) AS LONG    FUNCTION = %FALSE
          IF DIR$(FullFileName, %NORMAL OR %READONLY OR %HIDDEN OR %SYSTEM) = "" THEN EXIT FUNCTION
          FUNCTION = %TRUE
        END FUNCTION
      
      There are at least two point where I can optimize a bit (passing file name by reference and testing for length instead of string)


    Looking at your functions maybe you can get some heklp on File_PathSplit function: http://www.thinbasic.com/public/prod..._pathsplit.htm
    Calling a native thinBasic function is always hundred of times faster than calling script functions.

    Regarding profiling, if your script does not crash due to recursion ... thinBasic has a rough profiler that can help you.
    Just add the following line to your source code
    #Profile on, %TRUE
    
    and you will have the following: http://www.thinbasic.com/public/prod...ml?profile.htm
    It measure all script function call measuring execution time.
    It also save a .csv file in the same directory of your script.

    Hope this can help.
    Keep me informed, I will be happy to help on this massive task if I can.

    Ciao
    Eros
    www.thinbasic.com | www.thinbasic.com/community/ | help.thinbasic.com
    Windows 10 Pro for Workstations 64bit - 32 GB - Intel(R) Xeon(R) W-10855M CPU @ 2.80GHz - NVIDIA Quadro RTX 3000

  3. #3
    Thanks, Eros, for the advice about the File_Pathsplit function, and for the insight into how thinBasic operates internally. From your reply, I figure that a typical program would be more likely to run out of heap memory altogether, long before it could overflow the stack.

    I have to suggest, then, a seemingly easy optimization for tail calls (not just tail recursive): destroy the previous "local stack" and its pointer, and replace it with a pointer to the new local stack. This could decrease memory usage dramatically, at perhaps a slightly increased runtime cost (the time thinBasic takes to figure out if a call is a tail call or not). Or perhaps a "TAIL" or "tailcall" keyword could be introduced, to indicate to thinBasic that this optimization can be done on a given function or call.
    Though I don't know if this would be actually easy to implement in thinBasic.

    As for the recursive function I posted, I expect it to typically be executed less than a hundred times each time the program runs, so optimizing it isn't as high a priority as my tone might have implied. But then, you never know when you're going to find the madman who tries to create five thousand files, all named "%HOMEPATH%\a.txt"
    Still, I might run some tests one of these days.

    Also, congratulations and thanks for the #profile directive. That's a big win for thinBasic.

  4. #4
    thinBasic author ErosOlmi's Avatar
    Join Date
    Sep 2004
    Location
    Milan - Italy
    Age
    57
    Posts
    8,781
    Rep Power
    10
    Quote Originally Posted by Vandalf View Post
    I have to suggest, then, a seemingly easy optimization for tail calls (not just tail recursive): destroy the previous "local stack" and its pointer, and replace it with a pointer to the new local stack. This could decrease memory usage dramatically, at perhaps a slightly increased runtime cost (the time thinBasic takes to figure out if a call is a tail call or not). Or perhaps a "TAIL" or "tailcall" keyword could be introduced, to indicate to thinBasic that this optimization can be done on a given function or call.
    Though I don't know if this would be actually easy to implement in thinBasic.
    Yes, not easy at the moment mainly for my time. I want to release new version with new IDE and working on implementing any data type inside TYPE/END TYPE. That's my priority.
    But I will study a solution for tail calls.

    Quote Originally Posted by Vandalf View Post
    As for the recursive function I posted, I expect it to typically be executed less than a hundred times each time the program runs, so optimizing it isn't as high a priority as my tone might have implied. But then, you never know when you're going to find the madman who tries to create five thousand files, all named "%HOMEPATH%\a.txt"
    Still, I might run some tests one of these days.
    Le me know about your tests, I'm curious.

    Quote Originally Posted by Vandalf View Post
    Also, congratulations and thanks for the #profile directive. That's a big win for thinBasic.
    Thanks. It slow down execution a bit but can show interesting information for medium/complex scripts.
    www.thinbasic.com | www.thinbasic.com/community/ | help.thinbasic.com
    Windows 10 Pro for Workstations 64bit - 32 GB - Intel(R) Xeon(R) W-10855M CPU @ 2.80GHz - NVIDIA Quadro RTX 3000

  5. #5
    For reasons unknown, my copy of thinBasic is unable to generate timing data when profiling. I did try running it as administrator. All it ever shows in the time columns is 0.0000. The rest of the profiling data are shown correctly.
    In the end, I had to use a watch.

    In an empty folder, uniquefilenamev1 generated 201 files in 12.5 seconds.
    In an empty folder, uniquefilenamev2 generated 201 files in 16 seconds.
    This was all with #profile on.
    I did some more tests, but I don't count them because I forgot to note if #profile was on or off during those. Still, they showed the same tendency where v1 is faster than v2.

    The difference between the two variants is that v1 uses the tob32 function, which takes a number and returns a corresponding base 32* string, while v2 uses the b32plusone function, which takes a base 32* string and returns the equivalent of converting it to a number, adding one to it, and then applying the tob32 function to that new number.

    *Custom encoding, not conforming to RFC 4648.

    This is the script:
    Uses "FILE"          
    #PROFILE On                       
    $tob32_charset = "0123456789ABCDEFGHJKMNPQRTUVWXYZ"
    '0123456789ABCDEFGH JK MN PQR TUVWXYZ
    %tob32_charsetptr = StrPtr($tob32_charset) As DWord
    
    $testfilename = "C:\users\usr1\tst\ab.txt"               
    
    For i As Word=0 To 200
      FILE_Append( uniquefilenamev1( $testfilename ) , "hello world" )
    Next      
    
    MsgBox 0, "done"                                                               
    
    Function uniquefilenamev1(unsafe As String) As String
    If Not FILE_Exists(unsafe) Then Return unsafe
    Return sequencefilenamev1( _  
               FILE_PathSplit(unsafe, %PATH_ROOTPATHFILE), _
               "." & FILE_PathSplit(unsafe, %PATH_EXT), _
               0)
    End Function     
    
    Function uniquefilenamev2(unsafe As String) As String
    If Not FILE_Exists(unsafe) Then Return unsafe
    Return sequencefilenamev2( _  
               FILE_PathSplit(unsafe, %PATH_ROOTPATHFILE), _
               "." & FILE_PathSplit(unsafe, %PATH_EXT), _
               "0")
    End Function                              
                                
    Function sequencefilenamev1(filename As String, dotext As String, seqnum As Number) As String
    Dim retfilename As String = filename & "-" & tob32(seqnum) & dotext 
    If Not FILE_Exists(retfilename) Then Return retfilename                              
    Return sequencefilenamev1(filename,dotext,seqnum+1)
    End Function    
      
    Function sequencefilenamev2(filename As String, dotext As String, seqnum As String) As String
    Dim retfilename As String = filename & "-" & seqnum & dotext 
    If Not FILE_Exists(retfilename) Then Return retfilename                              
    Return sequencefilenamev2(filename,dotext,b32plusone(seqnum))
    End Function    
    
    Function tob32(nu As Number) As String
    Static n As DWord 
    Static res As String
    res=""
    Repeat
    n=nu And 31
    res= Memory_Get(%tob32_charsetptr+n,1) & res
    nu=SHIFT RIGHT nu, 5
    Until nu=0        
    Return res
    End Function                            
                                
    Function b32plusone(thestring As String) As String
    'returns a base 32 string with a value of one plus the given string's value 
    'the encoding it assumes is 0123456789ABCDEFGH JK MN PQR TUVWXYZ
    'it is case sensitive
    Static carry,tmp As Byte                
    Static i,j As DWord
    i= Len(thestring)
    j= StrPtr(thestring)-1    
    carry=0
    tmp=Asc(thestring,i)
    '                     HI,KL,NO,RS,9:,Z[ 
    If(In(Poke(j+i,tmp+1),73,76,79,83,58,91)>0) Then _
       If (In(Poke(j+i,tmp+2),59,92)>0) Then _
          If Poke(j+i,tmp+8)<>65 Then _
             carry=Poke(j+i,48)
    For i =i-1 To 1 Step -1 While ((carry>0) And (Asc(thestring,i)=90))
      Poke(j+i, 48)  
    Next          
    
    If (i<=0) Then 
      If (carry>0)Then Return "1" + thestring
    Else    
      tmp=Asc(thestring,i)
      If (carry>0) Then carry = 1
      If(In(Poke(j+i,tmp+carry),73,76,79,83,58)>0) Then _
        If Poke(j+i,tmp+2)=59 Then Poke(j+i,65)
    End If   
             
    Return thestring
    End Function
    
    Last edited by Vandalf; 21-03-2017 at 12:51.

  6. #6
    thinBasic author ErosOlmi's Avatar
    Join Date
    Sep 2004
    Location
    Milan - Italy
    Age
    57
    Posts
    8,781
    Rep Power
    10
    Ciao Vandalf,

    very interesting script and needs.
    I think Profiler is not showing timing due to impossibility (for some reasons) to use high resolution timing functions: https://msdn.microsoft.com/en-us/lib...(v=vs.85).aspx
    For future versions I will add some features allowing to understand if high resolution timing functions are available and if not automatically switch to a less precise timing.

    I've simplified a little your script in order to focus on few aspects.
    On my computer creating 201 files on an empty directory takes about 3.5 seconds.
    I will be into a 2 days meeting at work but I will recap this in the week-end.
    My idea is to try to optimize (if possible) some functions already used here in your script and/or develop new specific functions in Core or File modules.

    I will let you know.

    Thanks
    Eros

    Uses "FILE"
    uses "Console"
    
    
    #PROFILE On
    
    
    $tob32_charset = "0123456789ABCDEFGHJKMNPQRTUVWXYZ"
    '0123456789ABCDEFGH JK MN PQR TUVWXYZ
    %tob32_charsetptr = StrPtr($tob32_charset) As DWord
    
    
    '---Create a path under script directory where to save files 
    $testPath = APP_ScriptPath & "Profile_NoTiming\"
    DIR_MakeAll($testPath)
    
    
    '---Initial file name
    $testfilename = $testPath & "ab.txt"
    
    
    string sFileName
    
    
    double t0, t1, t2, t9
    
    
    t0 = Timer
    printl "Starting time:", Time$
    For i As Word=0 To 200
      t1 = Timer
      sFileName = uniquefilenamev1( $testfilename )
      t2 = timer
      printl i, sFileName, format$(T2 - T1, "#0.0000"), "secs"
      FILE_append(sFileName, "hello world")
    Next
    printl "Ending time:", Time$
    t9 = Timer
    
    
    printl "All done in", format$(T9 - T0, "#0.0000"), "secs"
    WaitKey
    
    
    
    
    Function uniquefilenamev1(unsafe As String) As String
      If Not FILE_Exists(unsafe) Then
        function = unsafe
      Else
        function = sequencefilenamev1( _
                 FILE_PathSplit(unsafe, %PATH_ROOTPATHFILE), _
                 "." & FILE_PathSplit(unsafe, %PATH_EXT), _
                 0)
      end if
    End Function
     
    Function sequencefilenamev1(filename As String, dotext As String, seqnum As long) As String
      string retfilename = filename & "-" & tob32(seqnum) & dotext
    
    
      If Not FILE_Exists(retfilename) Then
        function = retfilename
      Else
        function = sequencefilenamev1(filename, dotext, seqnum + 1)
      end If
    End Function
    
    
    Function tob32(nu As long) As String
      Static n As DWord
      dim res As String
      
      'res = ""
      Repeat
        n = nu And 31
        res = Memory_Get(%tob32_charsetptr + n, 1) & res
        nu = SHIFT RIGHT nu, 5
      Until nu = 0
      
      function = res
    End Function
    
    www.thinbasic.com | www.thinbasic.com/community/ | help.thinbasic.com
    Windows 10 Pro for Workstations 64bit - 32 GB - Intel(R) Xeon(R) W-10855M CPU @ 2.80GHz - NVIDIA Quadro RTX 3000

Similar Threads

  1. Tree Recursion
    By peter in forum Sources, Templates, Code Snippets, Tips and Tricks, Do you know ...
    Replies: 13
    Last Post: 12-05-2017, 08:58
  2. Tree Recursion 2
    By primo in forum TBGL General
    Replies: 1
    Last Post: 21-02-2016, 10:38
  3. A nice (verbal) example of recursion
    By RobbeK in forum General
    Replies: 2
    Last Post: 25-03-2014, 02:05
  4. Various types of recursion
    By Petr Schreiber in forum Sources, Templates, Code Snippets, Tips and Tricks, Do you know ...
    Replies: 6
    Last Post: 25-05-2010, 06:38
  5. Array vs type optimization
    By MouseTrap in forum Power Basic
    Replies: 6
    Last Post: 04-03-2009, 11:05

Members who have read this thread: 0

There are no members to list at the moment.

Tags for this Thread

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •