Results 1 to 6 of 6

Thread: Comparing Mid$ with string pointer

  1. #1

    Comparing Mid$ with string pointer

    Thanks Eros for the note about using the pointer for a string in the thread Advent of Code, 2017 by Petr, it is much speedier than Mid$
    i have tested a timing demo for a text of 270000 digits, when we use pointers its time is about 0.15 seconds but with Mid$ it is around 3.9 , ie 27 times speedier. i choose input String input = Repeat$(30000, "123456789") for reasonable waiting time
    if we replace 30000 by 300000, with pointer the time is about 1.6 sec. but with Mid$ it seems like Mid$ lost in the space for the text of 2,700,000 digits, even it should return after about 60 seconds, i have waited 5 minutes and then closed the console. it seems the time for the Mid$ is exponentially larger with the increasing length of the string.
    so i deduce Mid$ is suitable for everyday strings while pointer to string is suitable for huge strings
    Uses "console"
    Long total 
    
    Dim T1, T2 As Quad
    Extended T3
    
    String input = Repeat$(30000, "123456789")
    Long index
    Dim TXT(Len(input)) As String * 1 At StrPtr(input)
    PrintL "length of the numeric text= " + UBound(TXT)
    HiResTimer_Init
    T1 = HiResTimer_Delta
      For index = 1 To Len(input)
        
        total + TXT(index)
        'total + Mid$(input, index,1)
      Next
    T2 = HiResTimer_Delta
    PrintL "sum of digits= " + total
    
    T3 = T2-T1
    T3 = T3/1000000 'the one second = million microseconds
    PrintL LTrim$(Str$(T3))+ " (seconds)"
    WaitKey
    

  2. #2
    Super Moderator Petr Schreiber's Avatar
    Join Date
    Aug 2005
    Location
    Brno - Czech Republic
    Posts
    7,128
    Rep Power
    732
    Hi Primo,

    thanks for sharing this. It is important to keep this in spotlight, as it comes handy in many situations.

    Tip #1
    LTrim$(Str$(x)) is not needed. Just use Format$(x) for the same result.

    Tip #2
    HiResTimer_Delta is your friend, it is enough to call:
    HiResTimer_Get
    delta = HiResTimer_Delta
    
    ... to get th Delta from last call. This saves you 2 variables in this case.

    Tip #3
    You can build your benchmark routine and specific implementations then to functions.
    This allows automatical detection of those test functions and automatical benchmarking:
      Uses "console"
      uses "stat"
      
      function TBMain()
        ' -- Setup
        hiResTimer_Init   
        string input = repeat$(30000, "123456789")
        
        printL "length of the numeric text = " + len(input)
        printl
        
        ' -- List all user functions to array
        string functions()
        function_list(functions, %Function_User)
        
        ' -- Benchmark all functions matching signature "APPROACH_*"
        for f as long = 1 to countOf(functions) 
          if isLike(functions(f), "APPROACH_*", false) then
            Benchmark(functions(f), input)    
          end if
        next
        
        waitKey
      end function
      
      function Benchmark(functionName as string, parameter as string) as long    
        ext  delta
        long result
        
        hiResTimer_Get
        ' -- Call function by name
        call functionName(parameter) to result
        delta = hiResTimer_Delta ' Delta since last hiResTimer_* call
        
        printl lset$(functionName, 32 using " "),       ' -- Pad to left with spaces
               format$(delta/1000000, "0.000") + "s" +  ' -- Format it like we are interested in zero and 3 decimals
               ", result = " + result
        
        return result
        
      end function
      
      function Approach_Mid(byRef input as string) as long
        long result
        for index as long = 1 to len(input)
          result + mid$(input, index, 1)  
        next
        
        return result
      end function
      
      function Approach_String1Overlay(byRef input as string) as long
        dim TXT(len(input)) as string * 1 at strPtr(input)
        long result
        for index as long = 1 to len(input)
          result + TXT(index)
        next
        
        return result
      end function
      
      function Approach_Stat(byRef input as string) as long
        dim ascii(len(input)) as byte at strPtr(input)
        return stat_sum(ascii) - len(input)*48 ' -- Ha ha! ASCII values of numbers start at 48
      end function
    

    Petr
    Learn 3D graphics with ThinBASIC, learn TBGL!
    Windows 10 64bit - Intel Core i5-3350P @ 3.1GHz - 16 GB RAM - NVIDIA GeForce GTX 1050 Ti 4GB

  3. #3
    thinBasic author ErosOlmi's Avatar
    Join Date
    Sep 2004
    Location
    Milan - Italy
    Age
    57
    Posts
    8,777
    Rep Power
    10
    Interesting:

    length of the numeric text = 270000
    
    APPROACH_STRING1OVERLAY          0.293s, result = 1350000
    APPROACH_STAT                    0.004s, result = 1350000
    APPROACH_MID                     6.660s, result = 1350000
    
    Last edited by ErosOlmi; 27-12-2017 at 16:38.
    www.thinbasic.com | www.thinbasic.com/community/ | help.thinbasic.com
    Windows 10 Pro for Workstations 64bit - 32 GB - Intel(R) Xeon(R) W-10855M CPU @ 2.80GHz - NVIDIA Quadro RTX 3000

  4. #4
    thinBasic author ErosOlmi's Avatar
    Join Date
    Sep 2004
    Location
    Milan - Italy
    Age
    57
    Posts
    8,777
    Rep Power
    10
    To avoid to use "STAT" module you can substitute the following line
      Return stat_sum(ascii) - len(input)*48 ' -- Ha ha! ASCII values of numbers start at 48
    
    with the following line
      Return Array Sum Ascii - len(input)*48  ' -- Ha ha! ASCII values of numbers start at 48
    
    Also "Approach_String1Overlay" function can be optimized with the following.
    thinBasic will take care to convert string array elements into numbers before summing elements:
    function Approach_String1Overlay(byRef input as string) as long
      dim TXT(len(input)) as string * 1 at strPtr(input)
      Return Array Sum TXT
    end function
    
    This will perform 5x faster than original:
    length of the numeric text = 270000
    
    APPROACH_STRING1OVERLAY          0.077s, result = 1350000
    APPROACH_STAT                    0.005s, result = 1350000
    APPROACH_MID                     6.105s, result = 1350000
    
    www.thinbasic.com | www.thinbasic.com/community/ | help.thinbasic.com
    Windows 10 Pro for Workstations 64bit - 32 GB - Intel(R) Xeon(R) W-10855M CPU @ 2.80GHz - NVIDIA Quadro RTX 3000

  5. #5
    thinBasic author ErosOlmi's Avatar
    Join Date
    Sep 2004
    Location
    Milan - Italy
    Age
    57
    Posts
    8,777
    Rep Power
    10
    MID$ function is slow because every time the input string (first parameter of MID$) is parsed as a string expression and copied into another internal temporary string.
    If string is quite big it can takes a while.

    In the above example string is 270000 bytes and for/next takes place for each byte so 270000 times.
    Total number of bytes moved in memory is 270000^2 = 72900000000

    I'm trying to see if I found a way to understand if first parameter is not a string expression but a single scalar variable.
    In this case I could avoid to copy original string into a temp one but use the original string directly.
    www.thinbasic.com | www.thinbasic.com/community/ | help.thinbasic.com
    Windows 10 Pro for Workstations 64bit - 32 GB - Intel(R) Xeon(R) W-10855M CPU @ 2.80GHz - NVIDIA Quadro RTX 3000

  6. #6
    thinBasic author ErosOlmi's Avatar
    Join Date
    Sep 2004
    Location
    Milan - Italy
    Age
    57
    Posts
    8,777
    Rep Power
    10
    For the moment I've developed a new function called MIDF$ (fast version of MID$)
    Instead of accepting as first parameter a string expression it accept a scalar string variable.
    In this way it is able to determine string pointer and get needed bytes without the need to create a local temporary copy of the string expression.

    Results are this one:
    length of the numeric text = 270000
    
    APPROACH_STRING1OVERLAY                            0.243s, result = 1350000
    APPROACH_MIDF                                      0.220s, result = 1350000   <--------FAST version of MID$
    APPROACH_STAT                                      0.003s, result = 1350000
    APPROACH_MID                                       7.070s, result = 1350000
    APPROACH_STRING1OVERLAYWITHARRAYSUM                0.038s, result = 1350000
    
    MIDF$ unction will be present in next thinBasic update.

    Full code:
    Uses "console"uses "stat"
     
    function TBMain()
      ' -- Setup
      hiResTimer_Init  
      string input = repeat$(30000, "123456789")
       
      printL "length of the numeric text = " + len(input)
      printl
       
      ' -- List all user functions to array
      string functions()
      function_list(functions, %Function_User)
       
      ' -- Benchmark all functions matching signature "APPROACH_*"
      for f as long = 1 to countOf(functions) 
        if isLike(functions(f), "APPROACH_*", false) then
          Benchmark(functions(f), input)    
        end if
      next
       
      waitKey
    end function
     
    function Benchmark(functionName as string, parameter as string) as long   
      ext  delta
      long result
       
      hiResTimer_Get
      ' -- Call function by name
      call functionName(parameter) to result
      delta = hiResTimer_Delta ' Delta since last hiResTimer_* call
       
      printl lset$(functionName, 50 using " "),       ' -- Pad to left with spaces
             format$(delta/1000000, "0.000") + "s" +  ' -- Format it like we are interested in zero and 3 decimals
             ", result = " + result
       
      return result
       
    end function
     
    function Approach_Mid(byRef input as string) as long
      long result
      for index as long = 1 to len(input)
        result +  mid$(input, index, 1)
      next
       
      return result
    end function
    
    
    function Approach_MidF(byRef input as string) as long
      long result
      for index as long = 1 to len(input)
        result +  midf$(input, index, 1)  
      next
       
      return result
    end function
     
    function Approach_String1Overlay(byRef input as string) as long
      dim TXT(len(input)) as string * 1 at strPtr(input)
      long result
      for index as long = 1 to len(input)
        result + TXT(index)
      next
       
      return result
    end function
    
    
    function Approach_String1OverlayWITHArraySum(byRef input as string) as long
      dim TXT(len(input)) as string * 1 at strPtr(input)
      return Array sum TXT
    end function
     
    function Approach_Stat(byRef input as string) as long
      dim ascii(len(input)) as byte at strPtr(input)
      return stat_sum(ascii) - len(input)*48 ' -- Ha ha! ASCII values of numbers start at 48
    end function
    
    Last edited by ErosOlmi; 27-12-2017 at 23:06.
    www.thinbasic.com | www.thinbasic.com/community/ | help.thinbasic.com
    Windows 10 Pro for Workstations 64bit - 32 GB - Intel(R) Xeon(R) W-10855M CPU @ 2.80GHz - NVIDIA Quadro RTX 3000

Similar Threads

  1. Comparing different filtering algorithms
    By Petr Schreiber in forum TBGL General
    Replies: 5
    Last Post: 31-10-2017, 22:56
  2. Pointer to Variant
    By mike lobanovsky in forum Tips and Tricks
    Replies: 5
    Last Post: 17-12-2013, 11:35
  3. when it's time to use a pointer?
    By largo_winch in forum Sources, Templates, Code Snippets, Tips and Tricks, Do you know ...
    Replies: 3
    Last Post: 01-12-2011, 20:16
  4. Pointer methods
    By Richard in forum Real world situations and solutions using thinBasic
    Replies: 1
    Last Post: 05-02-2010, 00:48
  5. Type pointer var not supported by WITH?
    By Michael Hartlef in forum thinBasic General
    Replies: 9
    Last Post: 24-09-2008, 15:42

Members who have read this thread: 1

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •