Commit graph

29 commits

Author SHA1 Message Date
Kovid Goyal
237bb35ee9
More CodeQL fixes 2025-04-20 21:53:11 +05:30
Kovid Goyal
32f0da2e77
Ensure no frame is created for assembly functions 2024-03-15 07:58:09 +05:30
Kovid Goyal
47fea26b62
Add an IndexByte implementation useful for benchmarking against stdlib SIMD implementation 2024-03-07 09:36:40 +05:30
Kovid Goyal
a7c06b38e6
We don't actually need vzeroupper at start of function
GCC emits vzeroupper automatically when compiling with native
optimizations, but we still need it otherwise
2024-02-25 09:57:43 +05:30
Kovid Goyal
720618bc37
Use Go 1.22 for building
It supports PCALIGN on non-ARM arches as well
2024-02-25 09:57:43 +05:30
Kovid Goyal
c01b959723
Fix Go unaligned index implementation 2024-02-25 09:57:42 +05:30
Kovid Goyal
bbdb0b15f3
DRYer 2024-02-25 09:57:42 +05:30
Kovid Goyal
b5edd9ad57
Don't precalculate mask in loop body
No need since we don't shift. Avoids the extra mask instructions for the
not found case.
2024-02-25 09:57:42 +05:30
Kovid Goyal
f9fd6ffd46
Use only aligned loads for index funcs
Also obviates the necessity for safe slice wrappers
2024-02-25 09:57:41 +05:30
Kovid Goyal
31a5fcf297
DRYer 2024-02-25 09:57:41 +05:30
Kovid Goyal
561712090d
Fix cmplt implementation 2024-02-25 09:57:41 +05:30
Kovid Goyal
57f4ea4d4a
Add some tests for broadcast from constant intrinsic 2024-02-25 09:57:41 +05:30
Kovid Goyal
9b0ae8d403
Don't use VEX-encoded instructions for 128-bit ISA 2024-02-25 09:57:41 +05:30
Kovid Goyal
aed0611fb8
Avoid double trailing RET 2024-02-25 09:57:40 +05:30
Kovid Goyal
5a5e31c38b
Also zero upper at start of function 2024-02-25 09:57:40 +05:30
Kovid Goyal
db2e0e816d
Fix mixing of register types in the same function 2024-02-25 09:57:40 +05:30
Kovid Goyal
a298781b85
DRYer 2024-02-25 09:57:40 +05:30
Kovid Goyal
d5cd9ef2ca
... 2024-02-25 09:57:40 +05:30
Kovid Goyal
da31db3212
... 2024-02-25 09:57:40 +05:30
Kovid Goyal
601c4ad4df
Fix some typos 2024-02-25 09:57:40 +05:30
Kovid Goyal
68d800d4fa
make clean should clean generated asm as well 2024-02-25 09:57:40 +05:30
Kovid Goyal
9fc3db1dd1
Work on C0 index func 2024-02-25 09:57:40 +05:30
Kovid Goyal
161eae78b6
Make generated asm_* files world readable 2024-02-25 09:57:40 +05:30
Kovid Goyal
77cfd44f24
More efficient clearing of register to all zeros or all ones 2024-02-25 09:57:39 +05:30
Kovid Goyal
59be7213cf
Make set1_epi8 more general 2024-02-25 09:57:39 +05:30
Kovid Goyal
d60dacbd09
Implement > and < intrinsics for vector registers 2024-02-25 09:57:39 +05:30
Kovid Goyal
82b7b4fcce
Make a reusable template for generating ASM index functions with different tests 2024-02-25 09:57:39 +05:30
Kovid Goyal
4e6138d785
Generate SIMD code during build 2024-02-25 09:57:39 +05:30
Kovid Goyal
de8c1e0206
Work on porting SIMD vt parser to Go for the kittens 2024-02-25 09:57:39 +05:30