feat: add `blas/base/cgemv` by DivitJain26 · Pull Request #9839 · stdlib-js/stdlib

DivitJain26 · 2026-01-20T11:00:55Z

Progresses #2039.

Description

What is the purpose of this pull request?

This pull request fixes correctness issues in the JavaScript implementation of cgemv for ndarray inputs.

Related Issues

Does this pull request have any related issues?

This pull request has the following related issues:

Questions

Any questions for reviewers of this pull request?

No.

Other

Any other information relevant to this pull request? This may include screenshots, references, and/or implementation notes.

No.

Checklist

Please ensure the following tasks are completed before submitting this pull request.

Read, understood, and followed the contributing guidelines.

AI Assistance

When authoring the changes proposed in this PR, did you use any kind of AI assistance?

Yes
No

If you answered "yes" above, how did you use AI assistance?

Code generation (e.g., when writing an implementation or fixing a bug)
Test/benchmark generation
Documentation (including examples)
Research and understanding

Disclosure

If you answered "yes" to using AI assistance, please provide a short disclosure indicating how you used AI assistance. This helps reviewers determine how much scrutiny to apply when reviewing your contribution. Example disclosures: "This PR was written primarily by Claude Code." or "I consulted ChatGPT to understand the codebase, but the proposed changes were fully authored manually by myself.".

@stdlib-js/reviewers

stdlib-bot · 2026-01-20T11:01:07Z

Hello! Thank you for your contribution to stdlib.

We noticed that the contributing guidelines acknowledgment is missing from your pull request. Here's what you need to do:

Please read our contributing guidelines.
Update your pull request description to include this checked box:

- [x] Read, understood, and followed the [contributing guidelines](https://github.com/stdlib-js/stdlib/blob/develop/CONTRIBUTING.md)

This acknowledgment confirms that you've read the guidelines, which include:

The developer's certificate of origin
Your agreement to license your contributions under the project's terms

We can't review or accept contributions without this acknowledgment.

Thank you for your understanding and cooperation. We look forward to reviewing your contribution!

kgryte · 2026-02-02T10:47:21Z

+					if ( trans === 'conjugate-transpose' ) {
+						aij = conjf( aij );
+					}
+					y.set( caddf( y.get( iy ), cmulf( aij, tmp ) ), iy );


Computing this way is not what we want. See zaxpy for an implementation which avoids complex instance materialization. caxpy still needs to be updated to use @stdlib/complex/float32/base/mul-add. You need to reinterpret as a real-valued strided array and then pass values to muladd.

Addressed in c09f790
All operations now use reinterpreted real/imag views. No complex instances are created inside hot loops, and accumulation uses muladd.assign.
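For readers unfamiliar with the pattern discussed above, here is a minimal sketch of a complex "a*x plus y" update performed on interleaved real/imaginary views. It is not the code from this PR: `caxpyInterleaved` is a hypothetical helper, the plain `Float32Array` inputs stand in for the reinterpreted views of `Complex64Array` instances, and ordinary arithmetic stands in for stdlib's `muladd.assign`.

```javascript
/**
* Hypothetical helper (not the PR's code): computes `y += a*x` for complex vectors
* stored as interleaved real/imaginary pairs in real-valued arrays.
*
* Strides and offsets are given in units of complex elements.
*/
function caxpyInterleaved( N, ar, ai, xv, strideX, offsetX, yv, strideY, offsetY ) {
	// Each complex element occupies two real slots, so indices are doubled:
	var sx = strideX * 2;
	var sy = strideY * 2;
	var ix = offsetX * 2;
	var iy = offsetY * 2;
	var xr;
	var xi;
	var i;
	for ( i = 0; i < N; i++ ) {
		xr = xv[ ix ];
		xi = xv[ ix+1 ];

		// (ar + ai*i) * (xr + xi*i), accumulated without creating any objects:
		yv[ iy ] += ( ar*xr ) - ( ai*xi );
		yv[ iy+1 ] += ( ar*xi ) + ( ai*xr );
		ix += sx;
		iy += sy;
	}
	return yv;
}

// x = [ 1+2i, 3+4i ], y = [ 1+1i, 1+1i ], a = 0+1i
var x = new Float32Array( [ 1.0, 2.0, 3.0, 4.0 ] );
var y = new Float32Array( [ 1.0, 1.0, 1.0, 1.0 ] );
caxpyInterleaved( 2, 0.0, 1.0, x, 1, 0, y, 1, 0 );
console.log( y );
// y => [ -1.0, 2.0, -3.0, 4.0 ]
```

The point of the sketch is only that the loop body touches numbers, never complex number objects, so nothing is allocated per iteration.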

kgryte · 2026-02-02T10:47:55Z

+				iy = offsetY;
+				for ( i0 = 0; i0 < ylen; i0++ ) {
+					aij = A.get( ia );
+					if ( trans === 'conjugate-transpose' ) {


Try to avoid burying conditionals within hot loops. If you need to duplicate loops, so be it.

Addressed in c09f790
Transpose/conjugation logic is resolved before entering inner loops. No conditionals remain inside performance-critical loops.
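As an illustration of hoisting the conjugation check out of the loop, the following sketch duplicates the loop body instead of testing a flag on every iteration. `cdotColumn` is a hypothetical name and the inputs are assumed to be interleaved real/imaginary `Float32Array` views; this is not the implementation in the PR.

```javascript
/**
* Hypothetical helper: accumulates sum( a[i] * x[i] ) or sum( conj(a[i]) * x[i] )
* over interleaved real/imaginary views.
*/
function cdotColumn( N, av, xv, conjugate ) {
	var re = 0.0;
	var im = 0.0;
	var ar;
	var ai;
	var xr;
	var xi;
	var i;
	var j;

	// The branch is taken once, before either loop starts...
	if ( conjugate ) {
		for ( i = 0; i < N; i++ ) {
			j = i * 2;
			ar = av[ j ];
			ai = -av[ j+1 ]; // conjugate: flip the sign of the imaginary part
			xr = xv[ j ];
			xi = xv[ j+1 ];
			re += ( ar*xr ) - ( ai*xi );
			im += ( ar*xi ) + ( ai*xr );
		}
	} else {
		for ( i = 0; i < N; i++ ) {
			j = i * 2;
			ar = av[ j ];
			ai = av[ j+1 ];
			xr = xv[ j ];
			xi = xv[ j+1 ];
			re += ( ar*xr ) - ( ai*xi );
			im += ( ar*xi ) + ( ai*xr );
		}
	}
	return [ re, im ];
}

// a = [ 1+2i, 0+1i ], x = [ 3+4i, 1+0i ]
var a = new Float32Array( [ 1.0, 2.0, 0.0, 1.0 ] );
var x = new Float32Array( [ 3.0, 4.0, 1.0, 0.0 ] );
var plain = cdotColumn( 2, a, x, false );
// plain => [ -5.0, 11.0 ]
var conj = cdotColumn( 2, a, x, true );
// conj => [ 11.0, -3.0 ]
```

The duplication costs a few lines, but each inner loop is then free of branches.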

kgryte · 2026-02-02T10:48:45Z

+			} else {
+				iy = offsetY;
+				for ( i0 = 0; i0 < ylen; i0++ ) {
+					aij = A.get( ia );


Avoid materializing complex number instances. Pull the real and imaginary components directly from a real-valued reinterpretation, as in zaxpy.

Complex values are handled via scalar real/imag components from reinterpreted views.

kgryte · 2026-02-02T10:49:46Z

+		ia = offsetA;
+		ix = offsetX;
+		for ( i1 = 0; i1 < xlen; i1++ ) {
+			tmp = cmulf( alpha, x.get( ix ) );


Before entering this loop, you need to decompose alpha and beta into real and imaginary components.

You'll likely want tmp to be a Float32Array workspace into which you write via cmul.assign (or similar). This wouldn't be thread-safe in C, but in JS, it is fine.

alpha and beta are decomposed into real and imaginary parts prior to iteration.
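A rough sketch of the workspace idea, under stated assumptions: `mulAssign` is a hypothetical stand-in for an assign-style complex multiply such as the `cmul.assign` mentioned above, and `alpha` is modeled as a plain object rather than a stdlib complex number instance.

```javascript
/**
* Hypothetical stand-in for an assign-style complex multiply: writes
* (ar + ai*i) * (br + bi*i) into `out` and returns `out`.
*/
function mulAssign( ar, ai, br, bi, out ) {
	out[ 0 ] = ( ar*br ) - ( ai*bi );
	out[ 1 ] = ( ar*bi ) + ( ai*br );
	return out;
}

// Assumption: a plain object models the complex scalar here.
var alpha = {
	're': 0.5,
	'im': 0.5
};

// Decompose once, before the loop:
var ar = alpha.re;
var ai = alpha.im;

// Two-element workspace reused on every iteration:
var tmp = new Float32Array( 2 );

// x = [ 2+2i, 1+1i ] as an interleaved view:
var xv = new Float32Array( [ 2.0, 2.0, 1.0, 1.0 ] );

var results = [];
var i;
for ( i = 0; i < 2; i++ ) {
	mulAssign( ar, ai, xv[ 2*i ], xv[ (2*i)+1 ], tmp );
	results.push( tmp[ 0 ], tmp[ 1 ] );
}
console.log( results );
// results => [ 0.0, 2.0, 0.0, 1.0 ]
```

Because `tmp` is overwritten on each pass, its contents have to be consumed before the next iteration, which is the reuse that would not be thread-safe in C.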

kgryte · 2026-02-02T10:50:48Z

+		ix = offsetX;
+		for ( i1 = 0; i1 < xlen; i1++ ) {
+			tmp = cmulf( alpha, x.get( ix ) );
+			if ( scabs1( tmp ) === 0.0 ) {


When you do, this line will change. When implementing complex number routines in JS, we do not exactly follow Fortran logic. Why? Because JS does not have built-in support for complex numbers.

Implementation now follows stdlib’s explicit real/imag arithmetic pattern rather than Fortran-style logic.

kgryte · 2026-02-02T10:58:57Z

+		ylen = M;
+	}
+	// y = beta*y
+	if ( scabs1( beta ) === 0.0 ) {


Yes, scabs1 will work here, but I am not sure why we want to expend the extra effort. If you decompose alpha and beta into their respective components, then you can test the components directly.

This comment applies here and elsewhere.

Zero checks now test scalar components directly.
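To make the equivalence concrete: `scabs1` computes `|re| + |im|`, which is zero exactly when both components are zero, so once `beta` is decomposed the components can be compared directly. The sketch below uses a hypothetical `scaleY` helper over an interleaved view; it is an illustration, not the PR's code.

```javascript
// Reference definition, shown for comparison only: |re| + |im|
function scabs1( re, im ) {
	return Math.abs( re ) + Math.abs( im );
}

/**
* Hypothetical helper: computes y = beta*y on an interleaved view, testing
* the components of `beta` directly rather than calling `scabs1`.
*/
function scaleY( N, br, bi, yv ) {
	var yr;
	var yi;
	var i;
	var j;
	if ( br === 0.0 && bi === 0.0 ) {
		// beta is zero: overwrite without reading `y`
		for ( i = 0; i < N; i++ ) {
			j = i * 2;
			yv[ j ] = 0.0;
			yv[ j+1 ] = 0.0;
		}
		return yv;
	}
	if ( br === 1.0 && bi === 0.0 ) {
		// beta is one: nothing to do
		return yv;
	}
	for ( i = 0; i < N; i++ ) {
		j = i * 2;
		yr = yv[ j ];
		yi = yv[ j+1 ];
		yv[ j ] = ( br*yr ) - ( bi*yi );
		yv[ j+1 ] = ( br*yi ) + ( bi*yr );
	}
	return yv;
}

var zeroed = scaleY( 2, 0.0, 0.0, new Float32Array( [ 1.0, 2.0, 3.0, 4.0 ] ) );
// zeroed => [ 0.0, 0.0, 0.0, 0.0 ]

var rotated = scaleY( 2, 0.0, 1.0, new Float32Array( [ 1.0, 2.0, 3.0, 4.0 ] ) );
// rotated => [ -2.0, 1.0, -4.0, 3.0 ]
```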

kgryte · 2026-02-02T11:00:16Z

+* cgemv( 'no-transpose', 2, 3, alpha, A, 3, 1, 0, x, 1, 0, beta, y, 1, 0 );
+* // y => <Complex64Array>[ 7.0, 0.0, 16.0, 0.0 ]
+*/
+function cgemv(trans, M, N, alpha, A, strideA1, strideA2, offsetA, x, strideX, offsetX, beta, y, strideY, offsetY) { // eslint-disable-line max-params, max-len


Much of your spacing here and below is incorrect and deviates from stdlib conventions.

Formatting updated to match stdlib conventions.

stdlib-bot · 2026-02-16T17:21:44Z

Coverage Report

| Package | Statements | Branches | Functions | Lines |
| --- | --- | --- | --- | --- |
| blas/base/cgemv | 587/589 (+0.00%) | 76/79 (+0.00%) | 4/4 (+0.00%) | 587/589 (+0.00%) |

The above coverage report was generated for the changes in this PR.

DivitJain26 · 2026-02-16T17:39:26Z

Hi @kgryte,

I’ve implemented the requested changes:

- Removed complex instance materialization inside hot loops and switched to real/imag component handling via reinterpretation.
- Decomposed alpha and beta into their real and imaginary parts before entering the main loops.
- Eliminated conditionals inside performance-critical loops where possible.
- Updated the implementation to use muladd.assign for accumulation.
- Refactored indexing and stride handling accordingly.

I’ve also updated and expanded the test cases to ensure full coverage. The numerical results have been cross-verified against SciPy.

Please let me know if there are any remaining performance or style concerns you’d like addressed.

DivitJain26 · 2026-02-16T17:44:07Z

While testing, I noticed that with a negative strideY, the resulting values appear reversed in the underlying buffer compared to the expected logical order (e.g., fixtures/row_major_xpyn.json).

Numerically the results are correct, but the physical layout differs when strideY < 0.

I wanted to confirm: is this behavior expected under the current BLAS semantics?
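For context on the question, here is a small sketch of how a stride/offset pair maps logical indices onto a buffer. It assumes the convention suggested by the `strideY` and `offsetY` parameters, namely that logical element `i` is stored at buffer index `offsetY + i*strideY`; `writeStrided` is a hypothetical helper, and the sketch does not settle what the maintainers consider expected for this routine.

```javascript
/**
* Hypothetical helper: writes logical element `i` of `values` to buffer
* index `offset + i*stride`.
*/
function writeStrided( values, buf, stride, offset ) {
	var idx = offset;
	var i;
	for ( i = 0; i < values.length; i++ ) {
		buf[ idx ] = values[ i ];
		idx += stride;
	}
	return buf;
}

var logical = [ 10.0, 20.0, 30.0 ];

// Positive stride: buffer order matches logical order.
var forward = writeStrided( logical, new Float32Array( 3 ), 1, 0 );
// forward => [ 10.0, 20.0, 30.0 ]

// Negative stride, starting from the last slot: buffer order is reversed.
var reversed = writeStrided( logical, new Float32Array( 3 ), -1, 2 );
// reversed => [ 30.0, 20.0, 10.0 ]
```

Under that assumption, a reversed physical layout with identical logical values is what a negative stride produces.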

kgryte · 2026-02-24T17:51:53Z

+    // Standard usage:
+    > var x = new {{alias:@stdlib/array/complex64}}([ 1.0, 1.0, 2.0, 2.0 ]);
+    > var y = new {{alias:@stdlib/array/complex64}}([ 1.0, 1.0, 2.0, 2.0 ]);
+    > var A = new {{alias:@stdlib/array/complex64}}([ 1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 4.0 ]);


Suggested change

-    > var A = new {{alias:@stdlib/array/complex64}}([ 1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 4.0 ]);
+    > var buf = [ 1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 4.0 ];
+    > var A = new {{alias:@stdlib/array/complex64}}( buf );

kgryte · 2026-02-24T17:53:03Z

+    > var x = new {{alias:@stdlib/array/complex64}}([ 2.0, 2.0, 1.0, 1.0 ]);
+    > var y = new {{alias:@stdlib/array/complex64}}([ 2.0, 2.0, 1.0, 1.0 ]);
+    > var A = new {{alias:@stdlib/array/complex64}}([ 1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 4.0 ]);
+    > var alpha = new {{alias:@stdlib/complex/float32/ctor}}( 0.5, 0.5 );
+    > var beta = new {{alias:@stdlib/complex/float32/ctor}}( 0.5, -0.5 );
+    > var ord = 'column-major';
+    > var trans = 'no-transpose';


Suggested change

-    > var x = new {{alias:@stdlib/array/complex64}}([ 2.0, 2.0, 1.0, 1.0 ]);
-    > var y = new {{alias:@stdlib/array/complex64}}([ 2.0, 2.0, 1.0, 1.0 ]);
-    > var A = new {{alias:@stdlib/array/complex64}}([ 1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 4.0 ]);
-    > var alpha = new {{alias:@stdlib/complex/float32/ctor}}( 0.5, 0.5 );
-    > var beta = new {{alias:@stdlib/complex/float32/ctor}}( 0.5, -0.5 );
-    > var ord = 'column-major';
-    > var trans = 'no-transpose';
+    > x = new {{alias:@stdlib/array/complex64}}([ 2.0, 2.0, 1.0, 1.0 ]);
+    > y = new {{alias:@stdlib/array/complex64}}([ 2.0, 2.0, 1.0, 1.0 ]);
+    > A = new {{alias:@stdlib/array/complex64}}([ 1.0, 1.0, 2.0, 2.0, 3.0, 3.0, 4.0, 4.0 ]);
+    > alpha = new {{alias:@stdlib/complex/float32/ctor}}( 0.5, 0.5 );
+    > beta = new {{alias:@stdlib/complex/float32/ctor}}( 0.5, -0.5 );
+    > ord = 'column-major';
+    > trans = 'no-transpose';

kgryte · 2026-02-24T17:53:55Z

+    Performs one of the matrix-vector operations
+    `y = α*A*x + β*y`,
+    `y = α*A^T*x + β*y`,
+    or `y = α*A^H*x + β*y`
+    using alternative indexing semantics
+    and where `α` and `β` are complex scalars,
+    `x` and `y` are complex vectors,
+    and `A` is an `M` by `N` complex matrix.
+
+    While typed array views mandate a view offset
+    based on the underlying buffer,
+    the offset parameters support indexing semantics
+    based on starting indices.


This all needs to be wrapped at 80 characters. See https://github.com/stdlib-js/stdlib/blob/develop/docs/contributing/repl_text.md

stdlib-bot added the BLAS Issue or pull request related to Basic Linear Algebra Subprograms (BLAS). label Jan 20, 2026

DivitJain26 changed the title ~~Blas/base/cgemv~~ feat: add blas/base/cgemv Jan 30, 2026

kgryte reviewed Feb 2, 2026


DivitJain26 requested a review from kgryte February 16, 2026 17:39

stdlib-bot added the Needs Review A pull request which needs code review. label Feb 16, 2026

DivitJain26 marked this pull request as ready for review February 16, 2026 17:40

DivitJain26 mentioned this pull request Feb 24, 2026

Office Hours (2026-02-24) stdlib-js/meetings#85

Closed

kgryte reviewed Feb 24, 2026


DivitJain26 closed this Feb 25, 2026

DivitJain26 force-pushed the blas/base/cgemv branch from f4efdb0 to ff420ae Compare February 25, 2026 11:53

stdlib-bot removed the Needs Review A pull request which needs code review. label Feb 25, 2026

DivitJain26 mentioned this pull request Apr 21, 2026

Office Hours (2026-04-21) stdlib-js/meetings#100

Closed