Hi Charles,
thanks a lot for your work, I am looking forward to the public version.
I am also very happy that the problem with colors is gone. I was investigating that topic and could not find any logical reason for it.
So great!
Petr
Learn 3D graphics with ThinBASIC, learn TBGL!
Windows 10 64bit - Intel Core i5-3350P @ 3.1GHz - 16 GB RAM - NVIDIA GeForce GTX 1050 Ti 4GB
Thanks for the update Charles.
Thanks Charles, hope to tinker with this module soon!
Acer Notebook: Win 10 Home 64 Bit, Core i7-4702MQ @ 2.2Ghz, 12 GB RAM, nVidia GTX 760M and Intel HD 4600
Raspberry Pi 3: Raspbian OS use for Home Samba Server and Test HTTP Server
Came across some useful information about the different varieties of SIMD extension:
http://softpixel.com/~cwright/programming/simd/
It might be a good idea to add these instructions to the set at some stage even though they are vendor-specific.
Note that AMD did not support SSE2 until 2003 with the Athlon64 and Opteron processors.
Hi Charles,
yes, SSE2 is "quite" new on AMD CPUs. But if you check, many new games support GeForce 5 (FX) and up, which is also 2003 stuff. So for wanton developers SSE2 availability is no problem, as they look no more than 5 years back.
I could not play a lot of games on my Duron PC because they required SSE or SSE2. Do you think they gave me an error message like "SSE not found"? No... I realised it only because I checked the features of my new CPU, and it seems it differed mainly in the implementation of this tech.
But as in thinBASIC with Asmosphere we can decide which assembly to use on the fly, a script can contain both an SSE and a non-SSE version and use the most appropriate one.
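To make the idea concrete, here is a minimal sketch in C (not thinBASIC or Asmosphere code) of choosing between an SSE and a non-SSE routine at run time. The names `add_scalar`, `add_sse` and `select_add` are made up for illustration, and `__builtin_cpu_supports` is a GCC/Clang builtin, so this assumes one of those compilers. On other targets the sketch simply falls back to the scalar loop.

```c
#include <stddef.h>

/* Assumption: GCC or Clang targeting x86 with SSE enabled at build time. */
#if defined(__SSE__) && (defined(__GNUC__) || defined(__clang__))
#include <xmmintrin.h>
#define HAVE_SSE_BUILD 1
#else
#define HAVE_SSE_BUILD 0
#endif

typedef void (*add_fn)(const float *a, const float *b, float *out, size_t n);

/* Non-SSE version: one float per iteration, works on any CPU. */
void add_scalar(const float *a, const float *b, float *out, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        out[i] = a[i] + b[i];
}

#if HAVE_SSE_BUILD
/* SSE version: four floats per ADDPS, then a scalar tail for the rest. */
void add_sse(const float *a, const float *b, float *out, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);   /* unaligned load, like MOVUPS */
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb));
    }
    for (; i < n; ++i)
        out[i] = a[i] + b[i];
}
#endif

/* Decide once, at run time, which version to use. */
add_fn select_add(void)
{
#if HAVE_SSE_BUILD
    if (__builtin_cpu_supports("sse"))
        return add_sse;
#endif
    return add_scalar;
}
```

A program would call `select_add()` once at startup and keep the returned pointer. Note that on 64-bit x86 builds SSE is always present, so the check only really matters for 32-bit targets such as the old Duron.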
Thanks,
Petr
P.S. Great website!
Asmosphere has passed all the dry tests, so I have posted the latest version to the beginning of this thread. This version has the full Intel instruction set including SSE2. I will follow on shortly with the more CPU-specific instructions, including SSE3, SSE4 and 3DNow. As before, the full source code is included. There is also a test list with all the instructions in various modes, but don't try to run it! That would be like going into a chemistry lab and mixing all the chemicals together.
This assembler is quite good at trapping and reporting errors - about a quarter of its code is dedicated to this function, but with the complexity of the instruction set, and the irregularities, this is essential.
My next task is to do a substantial piece of code, making use of the preprocessor, to see how well it stands up in practice. Something in OpenGL will be a good candidate. The preprocessor code looks very similar to Basic or C, so it should be possible to borrow large hunks of source code without too many alterations.
Thanks a lot Charles,
looks pretty good.
I can't wait to put my fingers on SSE!
Thanks,
Petr
The level you have reached is astonishing Charles.
I will release Oxygen in the next thinBasic release.
I hope to see real-life usage of this fantastic module, for example speeding up calculations in big loops and other critical areas where speed is a must, or where the impossible can become possible.
Thanks again.
Eros
www.thinbasic.com | www.thinbasic.com/community/ | help.thinbasic.com
Windows 10 Pro for Workstations 64bit - 32 GB - Intel(R) Xeon(R) W-10855M CPU @ 2.80GHz - NVIDIA Quadro RTX 3000
Thanks Charles. Hope we mortals can make use of this power!
Hi Charles,
I started to play with SSE and found a tutorial here:
http://www.neilkemp.us/v3/tutorials/SSE_Tutorial_1.html
I tried to convert the following code:
[code=c]
// A 16-byte = 128-bit vector struct
struct Vector4
{
    float x, y, z, w;
};

// Add two constant vectors and return the resulting vector
Vector4 SSE_Add ( const Vector4 &Op_A, const Vector4 &Op_B )
{
    Vector4 Ret_Vector;
    __asm
    {
        MOV EAX, Op_A                 // Load pointers into CPU regs
        MOV EBX, Op_B
        MOVUPS XMM0, [EAX]            // Move unaligned vectors to SSE regs
        MOVUPS XMM1, [EBX]
        ADDPS XMM0, XMM1              // Add vector elements
        MOVUPS [Ret_Vector], XMM0     // Save the return vector
    }
    return Ret_Vector;
}
[/code]
To thinBASIC:
[code=thinbasic]
uses "Oxygen"
type Txyzw
  x as single
  y as single
  z as single
  w as single
end type
dim v1, v2, resultV3 as Txyzw
v1.x = 0
v1.y = 0
v1.z = 0
v1.w = 0
v2.x = 1
v2.y = 1
v2.z = 1
v2.w = 1
dim SSE_Demo1 as string = "
MOV EAX, [#v1] ' Load pointers into CPU regs
MOV EBX, [#v2]
movups XMM0, [EAX] ' Move unaligned vectors to SSE regs
movups XMM1, [EBX]
addps XMM0, XMM1 ' Add vector elements
movups [#resultV3], XMM0 ' Save the return vector
ret
"
dim mc_SSE_Demo1 as string = O2_asm(SSE_Demo1)
if mc_SSE_Demo1 = chr$(&hc3) then
  msgbox 0, "Assembly error"+$CRLF+O2_Error
else
  MC_Exec(mc_SSE_Demo1)
  msgbox 0, STR$(resultV3.x)+STR$(resultV3.y)+STR$(resultV3.z)
endif
[/code]
Maybe it is a UDT handling problem, or the use of the EBX register? I am not sure.
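For comparison, here is a small C sketch that does the same addition with SSE intrinsics instead of inline assembly, which can serve as a reference for what the routine should return. It assumes an x86 compiler with SSE enabled. `sse_add` is just an illustrative name, not something from the tutorial or from Oxygen. The intrinsics are given the addresses of the two structs, which is what the two MOV lines are meant to put into EAX and EBX.

```c
#include <xmmintrin.h>  /* SSE intrinsics: _mm_loadu_ps, _mm_add_ps, _mm_storeu_ps */

/* Four packed singles, 16 bytes in total, same layout as Txyzw. */
typedef struct {
    float x, y, z, w;
} Vector4;

/* Add two vectors component-wise using one packed SSE addition. */
Vector4 sse_add(const Vector4 *a, const Vector4 *b)
{
    Vector4 result;
    __m128 va = _mm_loadu_ps(&a->x);               /* unaligned load (MOVUPS) */
    __m128 vb = _mm_loadu_ps(&b->x);
    _mm_storeu_ps(&result.x, _mm_add_ps(va, vb));  /* ADDPS, then unaligned store */
    return result;
}
```

With v1 set to all zeros and v2 set to all ones, as in the script above, every component of the result should be 1.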
Thanks,
Petr