The first article in the series explained the basic motivation for using user-defined types (UDTs): their ability to encapsulate multiple fields and their straightforward memory consumption tracking.
The second part in the series will introduce you to dot notation, its benefits and pitfalls.
Dot notation
Designing a data model with UDTs is very straightforward, as the first article suggested. Each field of a UDT can then be accessed via dotted syntax:
type vector2d
  x as single
  y as single
end type

dim myPoint as vector2d
myPoint.x = 1
myPoint.y = 2

type line2d
  pointA as vector2d
  pointB as vector2d
end type

dim xAxis as line2d
xAxis.pointA.x = -10
xAxis.pointA.y = 0
xAxis.pointB.x = 10
xAxis.pointB.y = 0
You can even inherit the fields of one UDT in another. This can be done in two ways:
type vector3d
  vector2d
  z as single
end type

' -- or equivalent
type vector3d extends vector2d
  z as single
end type
The first approach even allows specifying multiple types to inherit fields from.
The second one allows inheriting from just one specified type, whose fields will always be inserted first in the new type.
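To illustrate the first, list-style form with more than one source type, here is a minimal sketch. The rgbColor and coloredPoint types are hypothetical names invented for this example, and the sketch assumes that fields pulled in this way are accessed directly on the new type, the same way v.x is accessed on vector3d elsewhere in this article.

```thinbasic
uses "console"

type vector2d
  x as single
  y as single
end type

' -- Hypothetical type, used only for this illustration
type rgbColor
  r as single
  g as single
  b as single
end type

' -- Inherit fields from both types by listing their names
type coloredPoint
  vector2d
  rgbColor
end type

dim p as coloredPoint

' -- Inherited fields are addressed directly on the new type
p.x = 10
p.y = 20
p.r = 1

printL p.x, p.y, p.r

waitKey
```

The extends form could not express this, as it accepts only a single parent type.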
Memory representation
Another interesting property of user-defined types is how they are stored in memory. The following example will give you a little hint:
uses "console"

type vector2d
  x as single
  y as single
end type

type vector3d extends vector2d
  z as single
end type

dim v as vector3d

printL "v: " + varPtr(v)
printL "v.x: " + varPtr(v.x)
printL "v.y: " + varPtr(v.y)
printL "v.z: " + varPtr(v.z)
printL

single x, y, z
printL "x: " + varPtr(x)
printL "y: " + varPtr(y)
printL "z: " + varPtr(z)

waitKey
- the pointer to the first field of the UDT equals the pointer to the UDT variable itself
- by default, the fields are tightly packed in memory, one after another
- the memory order of the fields is the same as the order in the declaration
- the memory order of the separate x, y, z variables is not guaranteed and need not follow their declaration order
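These observations can be cross-checked numerically. The sketch below is only an illustration: it derives each field's distance from the start of the variable by subtracting addresses returned by varPtr, and prints the total size reported by sizeOf. The variable names v_start and v_size are made up for this example.

```thinbasic
uses "console"

type vector3d
  x as single
  y as single
  z as single
end type

dim v as vector3d

' -- Address of the whole variable, used as the reference point
dword v_start  = varPtr(v)

' -- Distance of each field from the start of the variable, in bytes
dword x_offset = varPtr(v.x) - v_start
dword y_offset = varPtr(v.y) - v_start
dword z_offset = varPtr(v.z) - v_start

' -- Total size of one vector3d, in bytes
dword v_size   = sizeOf(vector3d)

printL "x offset: " + x_offset
printL "y offset: " + y_offset
printL "z offset: " + z_offset
printL "size of vector3d: " + v_size

waitKey
```

With three tightly packed single fields of 4 bytes each, the offsets should grow in declaration order, and the last offset plus 4 should match the reported size.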
The third point is very important if you want to access the fields via pointers. While it is tempting to assume that .x is available at offset +0, .y at +4 and .z at +8, please do not hardcode these values.
Never. Ever.
Instead, use the UDT_ElementOffset function to calculate the offsets dynamically. This keeps your code maintainable should you later reorder the fields in the UDT, for aesthetic reasons, for example.
uses "console"

type vector3d
  x as single
  y as single
  z As Single
end type

dim v(3) as vector3d
dword v_pointer
long i

printL "Bad practice - try changing order of fields in vector3d" in 12
v_pointer = varptr(v(1))
for i = 1 to countof(v)
  poke(single, v_pointer + 0, 1 * i)
  poke(single, v_pointer + 4, 2 * i)
  poke(single, v_pointer + 8, 3 * i)
  printL v(i).x, v(i).y, v(i).z
  v_pointer += sizeOf(vector3d)
next
printL

printL "Good practice - changing the order in vector3d does not affect result" in 10
dword x_offset = udt_elementoffset(v(1).x)
dword y_offset = udt_elementoffset(v(1).y)
dword z_offset = udt_elementoffset(v(1).z)
printL "x offset: " + x_offset
printL "y offset: " + y_offset
printL "z offset: " + z_offset
printL

v_pointer = varptr(v(1))
for i = 1 to countof(v)
  poke(single, v_pointer + x_offset, 1 * i)
  poke(single, v_pointer + y_offset, 2 * i)
  poke(single, v_pointer + z_offset, 3 * i)
  printL v(i).x, v(i).y, v(i).z
  v_pointer += sizeOf(vector3d)
next

waitKey
While the dotted syntax is very clear to follow, and a de facto standard in other languages as well, it comes at a performance price.
Each time a dot is found, a calculation takes place to find the referenced field. The effect gets more serious, the more dots you have in your expression.
This is something many data modeling maniacs do not take into account, which leaves them puzzled later.
With thinBasic, this overhead can be completely eliminated by using pointers (and you can impress girls by living dangerously).
Have a look at this very synthetic benchmark to demonstrate the time difference:
uses "console"

type vector3d
  x as single
  y as single
  z As Single
end type

dim v(1000000) as vector3d
long i

hiResTimer_Init
quad t1, t2

print "Classic dotted approach: "
t1 = hiResTimer_Get
for i = 1 to countOf(v)
  v(i).x = 1 * i
  v(i).y = 2 * i
  v(i).z = 3 * i
next
t2 = hiResTimer_Get
printL format$((t2 - t1)\1000) + " ms" in 12 ' -- If you wonder, this is forced integer division
'
dword v_pointer
dword x_offset = udt_elementoffset(v(1).x)
dword y_offset = udt_elementoffset(v(1).y)
dword z_offset = udt_elementoffset(v(1).z)
'
print "Virtual variable: "
t1 = hiResTimer_Get
single x at 0
single y at 0
single z at 0
v_pointer = varptr(v(1))
for i = 1 to countOf(v)
  setAt(x, v_pointer + x_offset)
  setAt(y, v_pointer + y_offset)
  setAt(z, v_pointer + z_offset)
  x = 1 * i
  y = 2 * i
  z = 3 * i
  v_pointer += sizeOf(vector3d)
next
t2 = hiResTimer_Get
printL format$((t2 - t1)\1000) + " ms" in 14

print "Poking around: "
t1 = hiResTimer_Get
v_pointer = varptr(v(1))
for i = 1 to countOf(v)
  poke(single, v_pointer + x_offset, 1 * i)
  poke(single, v_pointer + y_offset, 2 * i)
  poke(single, v_pointer + z_offset, 3 * i)
  v_pointer += sizeOf(vector3d)
next
t2 = hiResTimer_Get
printL format$((t2 - t1)\1000) + " ms" in 10

waitKey
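The virtual variable technique is not limited to walking through arrays. As a minimal sketch, assuming a single field is updated many times inside a loop, the field's address can be resolved once up front and the loop can then work through the overlay variable only. The accumulation loop is a made-up workload for illustration, not part of the benchmark above.

```thinbasic
uses "console"

type vector3d
  x as single
  y as single
  z as single
end type

dim v as vector3d
long i

' -- Resolve the address of v.x once, outside the loop
single x at 0
setAt(x, varPtr(v.x))

' -- Each write to x goes to the memory of v.x, without a dot in the loop
for i = 1 to 1000
  x = x + i
next

' -- Read the result back through the regular dotted syntax
printL v.x

waitKey
```

After the loop, v.x should hold the sum of the numbers 1 to 1000, which shows that the overlay and the dotted field refer to the same memory.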
User-defined types provide great flexibility for data modeling, exposed via dot notation, in line with the approach common across the industry. While this syntactic sugar offers great readability, it has a performance cost.
This article demonstrates a possible approach to overcoming this limitation in cases where performance matters. While the approach is pointer based, it can be made safer by using the UDT_ElementOffset function, which ensures proper field addressing.