Hi Everybody,
Here is my current effort, using a proper hash table to check for unique words, which is the step that takes up most of the time. I'm using an 8 byte hash of each word. When a word matches a hash, it is further checked against the original word. On the Bible, with an HP Pavilion desktop, it takes 0.370 seconds including compile time (0.06 seconds). With an 8 byte hash, the second part of the word check proves to be unnecessary, and omitting it will save another 0.05 seconds. Surprisingly, sorting only takes up 0.015 seconds.
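For readers who want to see the idea outside Oxygen, here is a rough Python sketch of unique-word detection with an 8 byte hash and an optional verify step. It is not the attached code: the hash function (FNV-1a), the names `hash64` and `unique_words`, the lower-casing and the letters-only word rule are all assumptions made for illustration.

```python
import re

FNV_OFFSET = 0xCBF29CE484222325  # standard FNV-1a 64-bit offset basis
FNV_PRIME = 0x100000001B3        # standard FNV-1a 64-bit prime
MASK64 = 0xFFFFFFFFFFFFFFFF      # keeps the hash within 8 bytes


def hash64(word: bytes) -> int:
    """8 byte FNV-1a hash (an assumed stand-in for the real hash coder)."""
    h = FNV_OFFSET
    for byte in word:
        h = ((h ^ byte) * FNV_PRIME) & MASK64
    return h


def unique_words(text: bytes, verify: bool = True) -> list:
    """Return the sorted unique words of text, keyed by their 8 byte hash."""
    table = {}  # hash -> first word seen with that hash (dict stands in for the hand-built table)
    for word in re.findall(rb"[a-z]+", text.lower()):
        stored = table.setdefault(hash64(word), word)
        # Optional second check: compare against the original word.
        if verify and stored != word:
            raise ValueError("hash collision: %r vs %r" % (stored, word))
    return sorted(table.values())


print(unique_words(b"In the beginning God created the heaven and the earth."))
```

Calling `unique_words(text, verify=False)` skips the second comparison, which corresponds to the saving described above; it is only safe as long as no two different words share the same 8 byte hash.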
Charles
PS I have not altered much recently but to make sure it all works:
Oxygen Update: http://community.thinbasic.com/index.php?topic=2517
Very nice Charles,
on my good old Sempron ticking at 1.8GHz it takes no more than 0.410 seconds.
Do you plan to add a word count for each word?
Petr
Learn 3D graphics with ThinBASIC, learn TBGL!
Windows 10 64bit - Intel Core i5-3350P @ 3.1GHz - 16 GB RAM - NVIDIA GeForce GTX 1050 Ti 4GB
Charles:
Your version is a real screamer!
I tried it on two different BIBLE.TXT files on my "beast" and got the following results---
             File 1             File 2
COMPILE:     47.622 microsec    46.630 microsec
RUN:         217.440 microsec   203.313 microsec
TOTAL:       265.062 microsec   249.943 microsec
WORD COUNT:  12,860             13,172
DELL XPS-1710
XP Pro SP-2
Intel Duo T7600 @ 2.33 GHZ
2.00 GB ram
Amazing optimizations you guys are coming up with in standard thinBasic and then Charles comes in with an incredible time with Oxygen.
I started to scan through the list generated by the program of sorted words and it is just amazing that the computer can do all of that so quickly.
I guess that is why we love our computers, they sure can do amazing stuff with the right amazing code!!
Acer Notebook: Win 10 Home 64 Bit, Core i7-4702MQ @ 2.2Ghz, 12 GB RAM, nVidia GTX 760M and Intel HD 4600
Raspberry Pi 3: Raspbian OS use for Home Samba Server and Test HTTP Server
Here is a more streamlined O2 version (mk 8).
I've removed some redundant code and combined one or two of the methods. The verify procedure has been switched out, as I am confident it is not needed with 8 byte hashing. This brings the time down from 0.370 to 0.280 seconds.
However, there seem to be some positional effects depending on the size of the source code string: even comments appear to affect the performance. I can't pin it down yet.
Charles
PS I'll put the word counts in later, Petr.
Hi all Bible testers,
I have tuned my version and shaved off nearly four seconds in the conventional thinBasic way (meaning without Oxygen). I made a few simple changes and added some new things.
1) Perhaps somebody with a fast machine could test the script and check the speed; that would be nice. My guess is around 4 to 5 seconds on a powerful machine, but I am not sure.
2) The second version includes Oxygen, but only uses it in a tricky way. Check it too.
Ciao, have a nice and sunny day, Lionheart
PS: It's almost frustrating to see Charles's fantastic result of 0.37 seconds... oops... Oxygen is alien and groovy!
you can't always get what you want, but if you try sometimes you might find, you get what you need
By changing most variables from local to static, then reworking the hash coder and word reader, further reductions have been achieved. The overall time has come down from 0.270 seconds to 0.222 seconds, which means the run time (excluding compilation) is now 0.165 seconds.
Wherever possible, data is loaded into the CPU in 4 character morsels at a time instead of single bytes, and many words will be processed without code loops.
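To illustrate the 4 character morsel idea for anyone not reading the Oxygen source, here is a small Python sketch. It is only an approximation of the technique: the zero padding, the little-endian loads, the multiplier constant and the names `load_chunks` and `fold_hash` are assumptions, not what reading9.tbasic actually does.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF  # keeps the folded value within 8 bytes


def load_chunks(word: bytes) -> list:
    """Split a word into 32-bit integers, 4 characters per load."""
    padded = word + b"\0" * (-len(word) % 4)  # zero-pad to a multiple of 4
    return [int.from_bytes(padded[i:i + 4], "little")
            for i in range(0, len(padded), 4)]


def fold_hash(word: bytes) -> int:
    """Fold the 4 character chunks into one 8 byte value (assumed mixing step)."""
    h = 0
    for chunk in load_chunks(word):
        h = ((h * 0x9E3779B1) ^ chunk) & MASK64
    return h


for w in (b"the", b"beginning"):
    print(w, len(load_chunks(w)), hex(fold_hash(w)))
```

A word of up to 4 characters, such as "the", fits in a single load, so in a compiled version it would need no per-character loop at all; "beginning" needs three loads instead of nine.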
Charles
This version is included with the latest Oxygen as reading9.tbasic
http://community.thinbasic.com/index.php?topic=2517
Hi Charles,
0.230 total time on my Sempron, amazing result!
Is getFile a new native Oxygen function?
Petr
OK, I will never get anywhere close to Charles's code, but here is my last try at the Bible word count.
The attached script takes advantage of the latest thinBasic beta 1.7.8.0, which you can get here: http://community.thinbasic.com/index.php?topic=2588.0
so it is mandatory to download it in order to test this script.
In particular, it uses the new statement ARRAY UNIQUE ...
which in a single line of code does most of the work of finding the unique words and counting them:
[code=thinbasic] array unique Words(), cWords(), ascend, lWords()[/code]
I think this new feature is general enough to be useful in many places where a programmer has to classify or count elements.
For the moment it is limited to dynamic string arrays, but once it has been tested enough it will easily be expanded to work on any kind of array.
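For readers without the beta installed, here is a rough Python equivalent of what the ARRAY UNIQUE line is described as doing: reduce a word array to its unique entries, sort them ascending, and count each one. The function name `array_unique` and the assumption that the statement returns the unique words and their counts as two parallel arrays are guesses from the description above, not documented behaviour.

```python
from collections import Counter


def array_unique(words: list) -> tuple:
    """Return (unique words sorted ascending, matching occurrence counts)."""
    counts = Counter(words)   # word -> number of occurrences
    unique = sorted(counts)   # ascending order, like the ASCEND option
    return unique, [counts[w] for w in unique]


unique, totals = array_unique(["the", "earth", "the", "and", "the"])
for word, n in zip(unique, totals):
    print(word, n)
```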
There are also some visible improvements in REPLACE$, thanks to the help of Petr, who sent me an optimized version.
I think JOIN$ will also show visible improvements.
Anyhow, in the end I was able to get down to around 3.1 seconds from the previous 6.5 or so. Hope you like it.
Ciao
Eros
www.thinbasic.com | www.thinbasic.com/community/ | help.thinbasic.com
Windows 10 Pro for Workstations 64bit - 32 GB - Intel(R) Xeon(R) W-10855M CPU @ 2.80GHz - NVIDIA Quadro RTX 3000
Just installed the latest thinBasic.
Your original code ran in 5.8 seconds, the new version in 2.7 seconds!