Dot Product

Post by **damianmoz** » 2023-11-26, 2:38:41

Love this feature. In the description, you could also mention that the command, even as it stands, can be used to produce more accurate components of a cross product. But I do not understand what is meant by:
...store the result in y0 or y1, where y0,y1 are elements of the result vector. The other
... element of y will receive a corresponding element of a third source vector c0,c1

Also, is there a way to recover the approximate error in the dot product command? How much extra work would it take to make this command an augmented operation such that it returns the sum and the error in that sum?

Or is it easier to instead provide the more general operation as:
... s += a0 * b0 + a1 * b1

Thanks.

Post by **agner** » 2023-11-26, 7:33:52

Damian, you need to use this instruction twice for a complex number multiplication or division. First, you calculate the real part of the product and place it in y0, where (y0,y1) is the result vector. Then you calculate the imaginary part and place it in y1, while retaining y0 by using (c0,c1) = (y0,0).
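The two-step scheme above can be modelled in a few lines of Python. This is only a sketch: the function `dot2`, its `lane` and `negate_second` parameters, and the explicit swap of `b` are hypothetical names chosen for illustration, not the actual instruction encoding. Ordinary Python floats round each product separately, so the sketch shows the data flow only, not whatever rounding the real instruction applies.

```python
def dot2(a, b, c, lane, negate_second=False):
    """Hypothetical model of the two-element dot product step.

    Writes a0*b0 + a1*b1 (or a0*b0 - a1*b1) into y[lane];
    the other element of y is copied from the third source c.
    """
    a0, a1 = a
    b0, b1 = b
    if negate_second:
        term = a0 * b0 - a1 * b1
    else:
        term = a0 * b0 + a1 * b1
    y = list(c)
    y[lane] = term
    return tuple(y)


def complex_mul(a, b):
    """Multiply a = a0 + i*a1 by b = b0 + i*b1 using two dot2 steps."""
    # Step 1: real part a0*b0 - a1*b1 goes into y0.
    y = dot2(a, b, (0.0, 0.0), lane=0, negate_second=True)
    # Step 2: imaginary part a0*b1 + a1*b0 goes into y1,
    # while y0 is retained by passing c = (y0, 0).
    b_swapped = (b[1], b[0])  # assumed operand swap, done explicitly here
    y = dot2(a, b_swapped, (y[0], 0.0), lane=1)
    return y


print(complex_mul((1.0, 2.0), (3.0, 4.0)))
```

For (1 + 2i)(3 + 4i) this gives the pair (-5.0, 10.0), matching the usual complex product.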

You are right that it can be useful for cross products as well, but this requires extra permutations.

Augmented addition is better handled by a separate instruction.

Post by **damianmoz** » 2023-11-26, 8:31:41

Thanks for the explanation about the real and imaginary parts. Silly me. I should have twigged to the context.

I realize that using it in a cross product needs extra permutations. Will they be incorporated? I thought that exploiting this instruction in as many applications as possible would yield the best ROI.

I will ponder the problem of augmented multiplication/addition within the computation of a dot product of vectors of length N, with N > 2 or N >> 2, and get back to you with, hopefully, some wiser thoughts.

In the last year, I have been working on several routines, all of which compute
.... (s, ds) = a0 * b0 + a1 * b1

and then I work with both s and ds in various ways, not just in augmented additions.

Post by **damianmoz** » 2023-11-28, 12:28:37

Given two complex numbers:
... a0 + i a1
... b0 + i b1
then the real part of the complex product has the same form as a component of a cross product:
... a0 * b0 - a1 * b1

So it appears to me that your proposed instruction already covers the basics of a cross product.

Post by **damianmoz** » 2023-11-28, 23:39:09

Is this an instruction (or a variant of it) that we need to suggest go into IEEE 754-2019?

I still think it needs to produce a tuple (y, dy) such that
... y + dy = +/-a0*b0 +/- a1*b1
for maximum generality, to satisfy as many people in the discussion as possible.

My 2c.

Post by **agner** » 2023-11-29, 5:57:43

A cross product has 3 dimensions. It requires more permutations. Permutations and data movement across vector lanes are expensive in hardware. Therefore, I have no plans to implement an instruction with more dimensions than 2.
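For reference, here are the three components of a cross product written out in Python. Each one is a two-term difference of products, but each draws its operands from a different pair of lanes, which is where the extra permutations come from. The function name `cross` is only illustrative.

```python
def cross(a, b):
    """Cross product of two 3-element vectors, component by component."""
    return (
        a[1] * b[2] - a[2] * b[1],  # uses lanes 1 and 2
        a[2] * b[0] - a[0] * b[2],  # uses lanes 2 and 0
        a[0] * b[1] - a[1] * b[0],  # uses lanes 0 and 1
    )


print(cross((1.0, 0.0, 0.0), (0.0, 1.0, 0.0)))
```

A two-element instruction can compute any one of these lines, but only after the right pair of elements from `a` and `b` has been moved into place.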

Augmented multiplication is possible with FMA instructions:
Y = a * b
dY = FMA(a, b, -Y)

This can be combined with augmented addition to produce a longer sum of products.
I am not sure whether we need to add a dot product instruction to the IEEE 754 standard. It is more important to discuss how to make long sums.
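A minimal Python sketch of how the FMA-based augmented multiplication above could be combined with augmented addition to get a pair (s, ds) for a two-term sum of products. Several things here are assumptions made for illustration: `math.fma` exists only from Python 3.13, so the helper `fma` emulates a fused multiply-add with exact rational arithmetic; the augmented addition is the classic branch-free two-sum, not any particular hardware instruction; and the names `two_product`, `two_sum` and `dot2_augmented` are hypothetical.

```python
from fractions import Fraction


def fma(a, b, c):
    """Emulated fused multiply-add: a*b + c computed exactly, rounded once."""
    return float(Fraction(a) * Fraction(b) + Fraction(c))


def two_product(a, b):
    """Augmented multiplication: y + dy == a*b exactly (barring underflow/overflow)."""
    y = a * b
    dy = fma(a, b, -y)
    return y, dy


def two_sum(a, b):
    """Augmented addition: s + ds == a + b exactly (barring overflow)."""
    s = a + b
    bb = s - a
    ds = (a - (s - bb)) + (b - bb)
    return s, ds


def dot2_augmented(a0, b0, a1, b1):
    """Return (s, ds) with s + ds approximating a0*b0 + a1*b1."""
    p0, e0 = two_product(a0, b0)
    p1, e1 = two_product(a1, b1)
    s, es = two_sum(p0, p1)
    # The three error terms are added in ordinary floating point,
    # so ds is itself rounded.
    ds = (e0 + e1) + es
    return s, ds


print(dot2_augmented(1e16, 1.0, 1.0, 1.0))
```

In this example the rounded sum `s` is 1e16 and the lost unit shows up in `ds` as 1.0. In general `ds` is a rounded estimate of the error, so `s + ds` is a close approximation rather than an exact representation of the sum of products.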

forwardcom forum