Description
On Apple M2 silicon, the BitNet code appears to trigger an LLVM optimization bug. Others have reported the symptom as a build that runs for hours (issues #251, #260, etc.). I have observed it with both Apple clang version 17.0.0 and Homebrew clang version 20.1.6. The issue is with the InterleavedLoadCombine optimization pass and the specific vector shuffle patterns in the BitNet code. With both compilers I see the same infinite recursion, with VectorInfo::computeFromSVI calling itself recursively. The fix for me was to disable the problematic optimization pass by passing -mllvm -disable-interleaved-load-combine to clang. With the pass disabled I was able to compile llama-cli and run inference.
diff setup_env.py setup_env.py.orig
214,227c214
<
< # Add LLVM optimization flags to work around the interleaved load combine bug
< llvm_fix_flags = "-O2 -mllvm -disable-interleaved-load-combine"
<
< run_command([
< "cmake", "-B", "build",
< *COMPILER_EXTRA_ARGS[arch],
< *OS_EXTRA_ARGS.get(platform.system(), []),
< "-DCMAKE_C_COMPILER=clang",
< "-DCMAKE_CXX_COMPILER=clang++",
< f"-DCMAKE_CXX_FLAGS={llvm_fix_flags}",
< f"-DCMAKE_C_FLAGS={llvm_fix_flags}"
< ], log_step="generate_build_files")
<
---
> run_command(["cmake", "-B", "build", *COMPILER_EXTRA_ARGS[arch], *OS_EXTRA_ARGS.get(platform.system(), []), "-DCMAKE_C_COMPILER=clang", "-DCMAKE_CXX_COMPILER=clang++"], log_step="generate_build_files")
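The diff above applies the flags unconditionally. A possible refinement, sketched below, is to add them only on hosts where the hang was seen. This is a rough sketch and not part of setup_env.py: the helper names `needs_llvm_workaround` and `cmake_configure_args` are hypothetical, and treating every macOS arm64 host as affected is an assumption (I have only confirmed the problem on an M2).

```python
import platform

# Flags taken from the diff above; -mllvm forwards the option to LLVM,
# which then skips the InterleavedLoadCombine pass.
LLVM_WORKAROUND_FLAGS = "-O2 -mllvm -disable-interleaved-load-combine"


def needs_llvm_workaround(system=None, machine=None):
    """Return True for macOS on arm64 (assumption: all Apple Silicon is affected)."""
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    return system == "Darwin" and machine == "arm64"


def cmake_configure_args(base_args, system=None, machine=None):
    """Return a copy of base_args, with the workaround flags appended when needed."""
    args = list(base_args)  # copy so the caller's list is not modified
    if needs_llvm_workaround(system, machine):
        args.append(f"-DCMAKE_C_FLAGS={LLVM_WORKAROUND_FLAGS}")
        args.append(f"-DCMAKE_CXX_FLAGS={LLVM_WORKAROUND_FLAGS}")
    return args


if __name__ == "__main__":
    base = ["cmake", "-B", "build",
            "-DCMAKE_C_COMPILER=clang", "-DCMAKE_CXX_COMPILER=clang++"]
    # Prints the argument list that would be handed to run_command on this host.
    print(cmake_configure_args(base))
```

Because the arguments are passed as a list rather than through a shell, the spaces inside the flag string need no extra quoting. On other platforms the argument list is returned unchanged, so the existing build is unaffected.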