Thread: Mixed Case Formatter

  1. #1
    thinBasic MVPs kryton9's Avatar
    Join Date
    Nov 2006
    Location
    Naples, Florida & Duluth, Georgia
    Age
    68
    Posts
    3,865
    Rep Power
    404

    Mixed Case Formatter

    I ended up with two versions. Rather than confuse things, I will post just the one I think is the best way to go.
    First the code, followed by usage:
    Uses "ui"
    Uses "File"
    Uses "Console"
    Dim Source , List As String
    Dim Filter , s , l As String
    Dim lcs, lcl As String
    Dim SourceNumLines , x , i , ListNumLines As Long
    
    Filter = "ThinBasic Files ( *.tBasic , *.tBasicc ) |*.tBasic;*.tBasicc|"
    Filter += "All Files ( *.* ) |*.*"
    Source = Dialog_OpenFile ( 0 , "Open a Source File" , Dir_GetCurrent , Filter , "tBasic" , %Ofn_FileMustExist Or %Ofn_HideReadOnly Or %Ofn_EnableSizing ) 
    List = App_SourcePath + "MixedCaseWordListv2.txt"
    
    Dim Sources As String Value File_Load ( Source ) 
    Dim Lists As String Value File_Load ( List ) 
    Dim FileOut As String Value "OutFileMixedCase.txt"
    Dim OutPutText As String
    Dim SourceLines ( ) , ListLines ( ) As String
    
    Dim Count As Long
    Dim pos As Long Value -1
    Dim nTimes As dWord = 1
    
    SourceNumLines = ParseCount ( Sources , $crlf ) 
    SourceNumLines = Parse ( Sources , SourceLines , $crlf ) 
    
    ListNumLines = ParseCount ( Lists , $crlf ) 
    ListNumLines = Parse ( Lists , ListLines , $crlf ) 
    
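    ' Process the source file one line at a time
    For Count = 1 To SourceNumLines
     Console_Cls
     Print "Working" + $crlf
     Print Count + " / " + SourceNumLines
     ' Start from an all-lowercase line; one call converts the whole string
     SourceLines ( Count ) = lCase$ ( SourceLines ( Count ) )
     s = SourceLines ( Count ) 
      ' Apply every word in list order, so words further down override earlier ones
      For i = 1 To ListNumLines
        l = ListLines(i)
        lcs = lCase$(s)
        lcl = lCase$(l)
        ' Overwrite each match with the casing taken from the list
        While pos < Len(s)
         pos = Instr( lcs , lcl ,nTimes)
         If pos > 0 Then
          Mid$(s,pos,Len(l)) = ListLines(i)
          Incr nTimes
         End If
         If pos = 0 Then Exit While
        Wend
        ' Reset the search state for the next word (same starting values as the Dim statements above)
        pos = -1
        nTimes = 1
      Next
      OutPutText += s + $crlf  
    Next
    ' Save the formatted text next to the script
    File_Save ( App_SourcePath + FileOut , OutPutText )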
    
    This program asks you to open the script file you want to convert to mixed case.
    It then uses "MixedCaseWordListv2.txt" to format your script and
    writes the result to a new file, "OutFileMixedCase.txt". Both files live in the same folder as the script.

    The beauty of this workflow is that all the complicated logic that would otherwise be involved is handled by your word list.
    I will attach an example list that is by no means complete, but at least you can get an idea of how your list should be created and maintained.
    Basically, a word overrides the earlier words it contains when it is located further down the list.
    So short words which appear often and would mess up the look:
    as to if on
    get overwritten when needed by
    was ton gif won
    which in turn can be overwritten by words further down the list:
    wash button gift wonder
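
    For anyone who wants to try the override idea outside thinBasic, here is a minimal sketch in Python (not the thinBasic script above). The function name apply_word_list and the sample words are made up for illustration, and it assumes plain ASCII text so that lowercasing never changes a word's length.

```python
def apply_word_list(line, words):
    """Overlay the casing of each list word onto a lowercased copy of line.

    Words are applied in list order, so an entry further down the list
    overwrites whatever earlier entries did to the same characters.
    """
    lowered = line.lower()
    chars = list(lowered)
    for word in words:
        needle = word.lower()
        if not needle:
            continue  # skip blank lines in the word list
        start = lowered.find(needle)
        while start != -1:
            # Same length in, same length out: only the casing changes.
            chars[start:start + len(needle)] = word
            start = lowered.find(needle, start + 1)
    return "".join(chars)


# Hypothetical word list, ordered short to long.
words = ["On", "To", "Ton", "Button"]
print(apply_word_list("my BUTTON", words))       # my Button
print(apply_word_list("my BUTTON", words[:-1]))  # my butTon
```

    The second call shows what happens when the longer word is missing from the list: the short matches are left standing in the middle of the word.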

    To make things easy, the attached file includes an MS Works spreadsheet. MS Works comes free on all XP machines, as far as I know.
    I set up a simple sheet which takes your word list in column B and puts the length of each word in column A.
    Just add words to the bottom of your list.
    Use fill down to copy the formula into the new rows of column A, and then sort by:
    col A asc
    col B asc
    Then select all your words in column B and paste them into "MixedCaseWordListv2.txt"
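
    If you would rather not use a spreadsheet, the same two-column sort can be sketched in a few lines of Python. This is only an illustration: sort_word_list is a made-up name, and comparing the words case-insensitively for the second sort key is my assumption about what the spreadsheet sort does.

```python
def sort_word_list(words):
    """Sort by word length first, then alphabetically (ignoring case)."""
    return sorted(words, key=lambda w: (len(w), w.lower()))


# Hypothetical, unsorted word list.
words = ["wash", "to", "Button", "as", "was", "on", "gift", "if", "ton"]
for word in sort_word_list(words):
    print(word)
```

    The printed lines can be pasted straight into the word list file, shortest words first.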

    It probably seems a lot scarier than it actually is to do.

  2. #2
    thinBasic MVPs kryton9's Avatar

    Re: Mixed Case Formatter

    OK, I just updated it so it can clean test.tbasic and itself: makeMixedCaseV2.tbasic

    If you look at the code, it handles what would otherwise be tough to do, like determining when to write TBGL, TBGL, or tbgl. This two-pass method really makes it easy.

    Remember to check first whether a word or keyword is in the big list. For instance, I had to add Dim, Len, etc.
    Then run the program and check what happens to the new words you added to the big list.

    If the result is not correct, copy the incorrect version and paste it into word list 2 together with the correction, in the form:
    incorrect,correct
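
    To make the second pass concrete, here is a rough Python sketch (not thinBasic). It assumes word list 2 holds one incorrect,correct pair per line, as described above; the function name apply_corrections and the sample pairs are invented for the example.

```python
def apply_corrections(text, correction_lines):
    """Second pass: replace each garbled form with its corrected form."""
    for entry in correction_lines:
        wrong, sep, right = entry.partition(",")
        wrong, right = wrong.strip(), right.strip()
        if sep and wrong:  # ignore blank or malformed lines
            text = text.replace(wrong, right)
    return text


# Hypothetical contents of word list 2.
corrections = [
    "fOlder,Folder",
    "cOnSole_wrItelIne,Console_WriteLine",
]
print(apply_corrections('cOnSole_wrItelIne("fOlder")', corrections))
# Console_WriteLine("Folder")
```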

    The latest version will always be in the first post.
    Acer Notebook: Win 10 Home 64 Bit, Core i7-4702MQ @ 2.2Ghz, 12 GB RAM, nVidia GTX 760M and Intel HD 4600
    Raspberry Pi 3: Raspbian OS use for Home Samba Server and Test HTTP Server

  3. #3
    thinBasic MVPs kryton9's Avatar

    New Idea not up yet

    I got a new idea that kicked in while updating the lists in the other style.

    I now take the big list and sort it by word length in ascending order, and then by the words in alphabetical order.
    Let's take the word begin as an example.
    The formatter first sees Be, then In later, and possibly Gin even later on.
    Don't forget the list is sorted by length and then alphabetically, so it would go something like this:

    begin
    Begin
    BegIn
    BeGin
    Begin
    Since Begin is 5 characters long, it overrides the first shorter matches.
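
    The sequence above can be reproduced with a small Python sketch (again, not the thinBasic code). trace_overlay is a hypothetical helper that reports the text after each list word has been applied; it assumes ASCII words and a list that is already sorted by length.

```python
def trace_overlay(text, words):
    """Yield the text after each word from the list has been applied."""
    lowered = text.lower()
    chars = list(lowered)
    for word in words:
        needle = word.lower()
        start = lowered.find(needle) if needle else -1
        while start != -1:
            chars[start:start + len(needle)] = word
            start = lowered.find(needle, start + 1)
        yield "".join(chars)


for stage in trace_overlay("begin", ["Be", "In", "Gin", "Begin"]):
    print(stage)
# Begin
# BegIn
# BeGin
# Begin
```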

    I can't use replace$ for this so I am using instr and some mid$ trickery.
    So far initial tests are good.

    Still tinkering with tests and debug sessions tracing it.

    But I wanted to let you guys know that I may be onto a different route from the first one, which works, but needs a lot of adjusting across many scripts until we build up a great second override list.

    If this second way works, only 1 list is needed.

  4. #4
    Super Moderator Petr Schreiber's Avatar
    Join Date
    Aug 2005
    Location
    Brno - Czech Republic
    Posts
    7,146
    Rep Power
    735

    Re: Mixed Case Formatter

    Hi Kent,

    Very interesting proggie.
    Sometimes it produces slightly funny output, like:
    fOlder
    cOnSole_wrItelIne
    The biggest "danger" is that it affects even string literals. It might be better to use a tokenizer to get the words, then analyse whether each one is OK to replace or is a string literal, and then you can safely REPLACE$ just the token you are handling.
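
    A rough sketch of the token-based idea, written in Python rather than thinBasic: the regular expression stands in for a real tokenizer, and the names TOKEN and format_tokens are invented for the example. It assumes string literals are delimited by double quotes and does not handle comments.

```python
import re

# Either a double-quoted string literal or an identifier-like word.
TOKEN = re.compile(r'"[^"]*"|[A-Za-z_][A-Za-z0-9_$]*')


def format_tokens(line, words):
    """Recase whole tokens found in the word list; leave string literals alone."""
    canonical = {word.lower(): word for word in words}

    def fix(match):
        token = match.group(0)
        if token.startswith('"'):
            return token  # string literal: keep exactly as written
        return canonical.get(token.lower(), token)

    return TOKEN.sub(fix, line)


# Hypothetical word list.
words = ["Console_WriteLine", "Folder", "On"]
print(format_tokens('console_writeline("open folder")', words))
# Console_WriteLine("open folder")
```

    Because only whole tokens are looked up, a short word such as On can no longer land in the middle of console_writeline, and the text inside the quotes is left untouched.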


    Thanks,
    Petr
    Learn 3D graphics with ThinBASIC, learn TBGL!
    Windows 10 64bit - Intel Core i5-3350P @ 3.1GHz - 16 GB RAM - NVIDIA GeForce GTX 1050 Ti 4GB

  5. #5
    thinBasic MVPs kryton9's Avatar

    Re: Mixed Case Formatter

    Petr, that is what I meant by garbled output; you can then fix it in word list two.

    The second method I am developing is the answer. There is a bug that is hard to trace even with the debugger, because it goes through so many words that stepping through them one by one is too tedious. Hopefully today I can solve the puzzle and have a nice working version.

  6. #6
    thinBasic MVPs kryton9's Avatar

    Re: Mixed Case Formatter

    The program is finished. I will change my first post and fill in the details there.
