https://github.com/ggerganov/llama.cpp/pull/613

ggerganov / llama.cpp

Make loading weights 10-100x faster #613

Merged: jart merged 9 commits into ggerganov:master from jart:loader on Mar 30, 2023.
+717 -328 · Conversation 35 · Commits 9 · Checks 22 · Files changed 11

jart (Collaborator) commented on Mar 29, 2023

This is a breaking change that's going to give us three benefits:

1. Your inference commands should load 100x faster.
2. You may be able to safely load models 2x larger.
3. You can run many concurrent inference processes.

This was accomplished by changing the file format so we can mmap() weights directly into memory without having to read() or copy them. That lets the kernel make its file cache pages directly accessible to our inference processes, and it makes those pages much less likely to be evicted (which would force loads to hit disk), because they are no longer competing with memory pages needlessly created by gigabytes of standard i/o.

The new file format supports single-file models like LLaMA 7B, and it also supports multi-file models like LLaMA 13B. Our Python tool now merges the foo.1, foo.2, etc. files back into a single file so that the C++ code which maps it doesn't need to reshape the data every time. That has made llama.cpp much simpler; much of its load code has now been deleted.

Furthermore, this change ensures that tensors are properly aligned on a 32-byte boundary, which opens the door to additional performance gains on some microprocessors from ops that require memory alignment. Lastly, note that both POSIX and Windows are supported.

This PR solves issue #91. It was written in collaboration with @slaren, and it is rebased on PR #586, so please do not squash merge; use either merge or rebase.
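As a rough illustration of the loading strategy described above, here is a minimal, hedged sketch of mapping a weights file on POSIX systems. The function name and error handling are hypothetical; the actual llama.cpp code differs in detail.

```cpp
// Minimal sketch (hypothetical names): map a model file read-only so the
// kernel's page cache backs the weights directly, with nothing copied into
// process-private buffers up front.
#include <sys/mman.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <cstddef>

static void *map_model_file(const char *fname, size_t *out_length) {
    int fd = open(fname, O_RDONLY);
    if (fd == -1) return nullptr;
    struct stat st;
    if (fstat(fd, &st) != 0) { close(fd); return nullptr; }
    // A shared read-only mapping lets every inference process that maps the
    // same file reuse one physical copy of the weights.
    void *addr = mmap(nullptr, (size_t)st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    if (addr == MAP_FAILED) return nullptr;
    *out_length = (size_t)st.st_size;
    return addr;
}
```

Because the new format stores each tensor at a 32-byte-aligned offset, tensor data can then be pointed at in place inside the mapping instead of being deserialized.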
slaren added 6 commits on March 29, 2023:

* Add mmap support for model files (2a6cef6)
* Fix ggml_init_params in quantize (a1e0f17)
* Make mmap_file static (4ae12d0)
* Unmap the file in llama_free (4daaa5e)
* Always initialize mm_addr and mm_length in llama_model (812cfa1)
* Initial windows support (untested) (80c2178)

jart added the labels "performance" (speed related topics) and "breaking change" (changes that break ABIs, APIs, file formats, or other forms of backwards compatibility) on Mar 29, 2023, and mentioned this pull request in #586 (Add support for memory mapping models, closed).

luminalle commented on Mar 30, 2023

Should the other converters also be rewritten to handle this new format?

jart force-pushed the loader branch from 69debdf to b806987 on March 30, 2023.

jart (Collaborator, author) commented on Mar 30, 2023

Yes indeed. I just fixed the quantize program. Now I'm hunting down all the tests.

jart force-pushed the loader branch from b806987 to a3307d2 on March 30, 2023.

jart (Collaborator, author) commented on Mar 30, 2023

All tests look green except for a CMake test. For example: https://github.com/ggerganov/llama.cpp/actions/runs/4559537462/jobs/8043597142?pr=613

I'm stumped on this error. I can't figure out where the file models/ggml-vocab.bin comes from. Does anyone know? Could it be a stale cache?

FNsi commented on Mar 30, 2023 (edited)

#355 mentioned: "Added ./models/ggml-vocab.bin containing just LLaMA vocab data (used for tests)"

bakkot reviewed llama.h on Mar 30, 2023:

```
@@ -20,7 +20,7 @@
 #endif
 #define LLAMA_FILE_VERSION 1
 #define LLAMA_FILE_MAGIC 0x67676d66 // 'ggmf' in hex
```
bakkot commented on Mar 30, 2023 (edited)

Nit: why change the magic rather than the version? I assumed the plan was to keep the magic constant forever. If you bump the version instead, old executables will recognize new model files and give a more useful error message. And it's nice to distinguish between "this is definitely a model file for this project, but it's the wrong version" and "this is some random junk we don't know anything about".

(This PR is a very neat bit of engineering; please don't let my nitpick distract from that.)

Green-Sky (Collaborator) replied on Mar 30, 2023: not a nitpick but a real change request :) (later: nvm)

ggerganov (Owner) commented on Mar 30, 2023 (edited)

@jart The models/ggml-vocab.bin is generated by convert-pth-to-ggml.py by providing an extra arg.

I expected mmap support to be much more intrusive, but in fact it turned out to be very compact. llama.cpp is much simpler now. Good stuff.

Regarding the version comment: yes, the plan was to bump the version and not the magic. But I'm OK with changing the magic to commemorate the significance of this update. In fact, maybe we can make this a thing, and everybody who makes a significant contribution to the project gets their initials appended to the version. What do you think?

Let me play with this tonight before merging. We have to take special care that all the other ggml model files floating around (Alpaca, GPT4All, Chinese LLaMA, etc.) have a nice way to convert to this new format, and update the instructions in the README. Also, some synchronisation with #545 may be needed.
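To make bakkot's distinction concrete, a check along these lines could report the two failure modes separately. This is a hypothetical sketch with made-up constants, names, and messages, not the code that ended up in llama.cpp:

```cpp
// Hypothetical sketch of a magic-then-version check; everything here is
// illustrative only.
#include <cstdint>
#include <cstdio>

enum load_status { LOAD_OK, LOAD_NOT_A_MODEL, LOAD_OLD_VERSION };

static load_status check_header(uint32_t magic, uint32_t version,
                                uint32_t want_magic, uint32_t want_version) {
    if (magic != want_magic) {
        // Unknown magic: probably not a model file for this project at all.
        fprintf(stderr, "unrecognized file (not a ggml model?)\n");
        return LOAD_NOT_A_MODEL;
    }
    if (version != want_version) {
        // Right project, wrong revision: the user can be told to re-convert.
        fprintf(stderr, "model file is from an older format revision; please re-convert\n");
        return LOAD_OLD_VERSION;
    }
    return LOAD_OK;
}
```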
jart added commit 75d1e55 ("Make loading weights 10-100x faster"); the commit message restates the PR description above and closes ggerganov#91. jart then force-pushed the loader branch from a3307d2 to 75d1e55 on March 30, 2023.

jart (Collaborator, author) commented on Mar 30, 2023

File updated. A lot more tests are green now. No idea what's up with the sanitizer.

I thought so too! I was pleasantly surprised by how well it worked out. Glad we took a few weeks to think.

I'm honored to hear you say that. I can round the magic up to 64 bytes if you like, so there's room to hand out kudos without breaking backwards compatibility in the future. Since my initials also act as a stamp of approval, I'm going to send a follow-up change after this that hardens the loading code, so that folks can trade model files in this format on Hugging Face with maximum safety and confidence.

#545 is an ambitious unification. I've done my best to comment my changes to make the merge less painful for the author. I've sought to update the other scripts too, but I don't know how to run them. One thing you could also consider for this project is a contrib/ folder, where folks can merge as much of their own stuff as they want, under the expectation that the ones who need it are the ones who maintain it.

jart added commit a45e843 ("Ensure --mlock works properly with mmap() support").

mqy reviewed llama.cpp on Mar 30, 2023:

```c
int fd = open(fname, O_RDONLY);
if (fd == -1) return 0;
int64_t length = lseek(fd, 0, SEEK_END);
void *addr = mmap(NULL, length, PROT_READ, MAP_SHARED, fd, 0);
```

mqy (Contributor) commented on Mar 30, 2023

1. Is it safer to use mmap64 for 4 GB+ files?
2. It seems mmap, mmap64 and MapViewOfFile support mapping from a given offset. Is it possible to map from header_len (as the offset)? If we can do this, there is no need to align the model file, right?

jart (Collaborator, author) replied on Mar 30, 2023

1. The right thing to do on 32-bit platforms is to have your build system define -D_FILE_OFFSET_BITS=64, which causes your system header files to automatically #define mmap mmap64.
2. File offsets passed to mmap() need to be page-size aligned, so I don't think so.
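To make jart's second answer concrete: the offset argument to mmap() must be a multiple of the page size, so mapping "starting at header_len" only works by rounding the offset down and compensating in user space, roughly as in this illustrative sketch (not code from this PR):

```cpp
// Illustration only: mapping "from an offset" requires a page-aligned offset,
// so you round down and re-add the remainder yourself. The PR avoids all of
// this by mapping from offset 0 and aligning tensor data within the file.
#include <sys/types.h>
#include <sys/mman.h>
#include <unistd.h>
#include <cstddef>

static const char *map_at_offset(int fd, off_t header_len, size_t file_len) {
    long page = sysconf(_SC_PAGESIZE);
    off_t aligned = header_len & ~(off_t)(page - 1);   // round down to a page boundary
    size_t slop   = (size_t)(header_len - aligned);    // extra bytes mapped before the data
    void *base = mmap(nullptr, file_len - (size_t)aligned,
                      PROT_READ, MAP_SHARED, fd, aligned);
    if (base == MAP_FAILED) return nullptr;
    return (const char *)base + slop;                  // first byte past the header
}
```

Mapping from offset 0, as the PR does, sidesteps the page-size dependency entirely and keeps the 32-byte tensor alignment a property of the file format alone.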
pgoodman commented on Mar 31, 2023 (edited)

@jart Is it possible to ensure the file size is a multiple of the hugepage size (e.g. using ftruncate), to benefit from fewer TLB lookups when the model data is accessed? (Corresponding mmap hints or other system-specific APIs, e.g. for macOS, might need to be used.)

jart (Collaborator, author) replied on Mar 31, 2023

It doesn't matter to mmap() if the file length isn't page-size aligned, even with smaller pages. You should be good to go if you modify the mmap() code in llama.cpp by hand and actually manage to get huge pages to work without nuking your machine :-)
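For anyone who does want to experiment in the direction pgoodman suggests, one Linux-specific starting point is to hint transparent huge pages on the mapped region with madvise(). This is a hypothetical sketch, not something this PR does, and the hint is purely advisory:

```cpp
// Hypothetical experiment (not part of this PR): ask Linux to back an
// existing mapping with transparent huge pages. The kernel is free to ignore
// the hint, and THP must be enabled system-wide or in "madvise" mode.
#include <sys/mman.h>
#include <cstddef>
#include <cstdio>

static void hint_huge_pages(void *addr, size_t length) {
#ifdef MADV_HUGEPAGE
    if (madvise(addr, length, MADV_HUGEPAGE) != 0) {
        perror("madvise(MADV_HUGEPAGE)");
    }
#else
    (void)addr; (void)length;  // not available on this platform (e.g. macOS)
#endif
}
```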
sw reviewed convert-pth-to-ggml.py and llama.cpp on Mar 30, 2023 (comments since resolved).

jart added commit "Introduce GGML migration tool for new file format" (f013c39, later c0f330f): if you deleted your old Meta LLaMA .pth files, the migrate-ggml-2023-03-30-pr613.py script will allow you to convert your old ggml files into the new mmap()'able format. jart force-pushed the loader branch from f013c39 to c0f330f on March 30, 2023.

jart (Collaborator, author) commented on Mar 30, 2023

@ggerganov This change now includes a migration tool named migrate-ggml-2023-03-30-pr613.py. It ensures that users of the old GGML file format who've deleted the original .pth files will be able to convert their ggml+ggmf files to the new ggml+ggjt format. Please take a look.

x02Sylvie commented on Mar 30, 2023

Having an issue migrating the alpaca model ggml-alpaca-13b-q4.bin: the Python script seems to think the model has two n_parts rather than one. Would adding an --n_parts argument to the conversion script, to manually specify --n_parts 1 just like when running alpaca models on llama.cpp, resolve the issue?

jart (Collaborator, author) replied on Mar 30, 2023

@x02Sylvie I don't have access to the Alpaca model. Could you send a pull request fixing that after this gets merged?

x02Sylvie replied on Mar 30, 2023 (edited)

I don't really know Python, so I'd rather leave the pull request to someone smarter than me. I did manage to get the alpaca 13B model converted by manually setting n_parts to 1 in the .py conversion script, though I'm unsure whether that's the proper place to set n_parts. I changed

```python
def get_n_parts(dim):
    mappings = {4096: 1, 5120: 2, 6656: 4, 8192: 8}
    n_parts = mappings.get(dim)
    if n_parts is None:
        print(f"Invalid dim: {dim}")
        sys.exit(1)
    print(f"n_parts = {n_parts}\n")
    return n_parts
```

to

```python
def get_n_parts(dim):
    mappings = {4096: 1, 5120: 2, 6656: 4, 8192: 8}
    n_parts = 1
    if n_parts is None:
        print(f"Invalid dim: {dim}")
        sys.exit(1)
    print(f"n_parts = {n_parts}\n")
    return n_parts
```

The model does work after conversion, however.

jart (Collaborator, author) commented on Mar 30, 2023

Yes, that code, which essentially guesses n_parts based on the dimension sizes, looks like a LLaMA kludge to me. @ggerganov would need to weigh in before we change it. I was simply cargo-culting other parts of the codebase that do this.

rabidcopy (Contributor) commented on Mar 30, 2023

So far the conversion script works with my current ggml models. I even converted the gpt4all model with convert-gpt4all-to-ggml.py and then converted that with migrate-ggml-2023-03-30-pr613.py, and it works. I do have to manually set n_parts in the script for my larger models that are in one part, but it still works nonetheless!

jart added commit adaba69 ("Introduce GGML migration tool for new file format") and force-pushed the loader branch from c0f330f to adaba69 on March 30, 2023.

jart (Collaborator, author) commented on Mar 30, 2023

I've just pushed an update to this PR addressing the n_parts issue. We're now using this algorithm in the migration tool:

```python
# count number of multipart files by convention
n_parts = 1
while True:
    if os.path.exists("%s.%d" % (args.fin_path, n_parts)):
        n_parts += 1
    else:
        break
```

slaren mentioned this pull request in #632 (SWAP info added to README, closed).

ggerganov (Owner) approved these changes on Mar 30, 2023

I think this is ready to merge - please go ahead.

ggerganov also left a review comment on a llama.cpp error message (since outdated):

```c
"\tsee https://github.com/ggerganov/llama.cpp/issues/91\n"
"\tuse convert-pth-to-ggml.py to regenerate from original pth\n"
"\tuse migrate-ggml-2023-03-30-pr613.py if you deleted originals\n"
, path);
```

Let's print the magic we got and the magic we expect here to help debug issues.
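A diagnostic along the lines ggerganov is asking for might look roughly like this sketch; the variable names and exact wording are illustrative, not necessarily what was merged:

```cpp
// Sketch of an error message that reports both the magic we read and the one
// we expected; names and wording are illustrative.
#include <cstdint>
#include <cstdio>

static void report_bad_magic(const char *path, uint32_t got, uint32_t want) {
    fprintf(stderr,
            "%s: invalid model file (bad magic: got %#x, want %#x)\n"
            "\tsee https://github.com/ggerganov/llama.cpp/issues/91\n"
            "\tuse convert-pth-to-ggml.py to regenerate from original pth\n"
            "\tuse migrate-ggml-2023-03-30-pr613.py if you deleted originals\n",
            path, got, want);
}
```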
jart (Collaborator, author) replied on Mar 30, 2023

As you wish. Updated.

CoderRC commented on Mar 30, 2023

Successfully compiled the master branch and jart's branch, and successfully ran ./main -m ./models/7B/ggml-model-q4_0.bin -p "Building a website can be done in 10 simple steps:" -t 8 -n 512, in msys2 with the mingw32 gcc compiler, using:

```sh
make LDFLAGS='-D_POSIX_MAPPED_FILES -lmingw32_extended' \
     CFLAGS='-D_POSIX_MAPPED_FILES -I. -O3 -DNDEBUG -std=c11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wdouble-promotion -Wshadow -Wstrict-prototypes -Wpointer-arith -Wno-unused-function -mfma -mf16c -mavx -mavx2' \
     CXXFLAGS='-D_POSIX_MAPPED_FILES -I. -I./examples -O3 -DNDEBUG -std=c++11 -fPIC -Wall -Wextra -Wpedantic -Wcast-qual -Wno-unused-function'
```

If you're confused about how exactly I compiled it, read #103 (comment).

This should be ready to merge, since my testing and compiling did not fail.
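CoderRC's build above goes through msys2's POSIX layer (-D_POSIX_MAPPED_FILES). On native Windows, the counterpart of the mmap() call is the CreateFileMapping/MapViewOfFile pair that the "Initial windows support (untested)" commit targets; the sketch below is a hedged illustration of that path, not the exact code in this PR:

```cpp
// Hypothetical sketch of a read-only file mapping on Windows; error handling
// is reduced to the bare minimum and the names are illustrative.
#include <windows.h>
#include <cstddef>

static const void *win32_map_file(const char *fname, size_t *out_length) {
    HANDLE file = CreateFileA(fname, GENERIC_READ, FILE_SHARE_READ, nullptr,
                              OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (file == INVALID_HANDLE_VALUE) return nullptr;
    LARGE_INTEGER size;
    if (!GetFileSizeEx(file, &size)) { CloseHandle(file); return nullptr; }
    HANDLE mapping = CreateFileMappingA(file, nullptr, PAGE_READONLY, 0, 0, nullptr);
    CloseHandle(file);            // the mapping object keeps the file alive
    if (mapping == nullptr) return nullptr;
    const void *addr = MapViewOfFile(mapping, FILE_MAP_READ, 0, 0, 0);
    CloseHandle(mapping);         // the mapped view keeps the mapping alive
    if (addr == nullptr) return nullptr;
    *out_length = (size_t)size.QuadPart;
    return addr;
}
```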
jart added commit 516474b ("Introduce GGML migration tool for new file format") and force-pushed the loader branch from adaba69 to 516474b on March 30, 2023.

jart (Collaborator, author) commented on Mar 30, 2023

All tests that can be green are green. The thread sanitizer failures mentioned earlier look due to yesterday's ARM NEON changes. Proceeding with merge as advised. It's really exciting to commit my first official change on this project. Thanks @ggerganov and everyone who helped!

jart merged commit ee0c40d into ggerganov:master on Mar 30, 2023, with 18 of 22 checks passed.

ggerganov mentioned this pull request in #581 (parallelize the quantization process, closed), and prusnak mentioned it in #640 (drop quantize.py now that models are using a single file, merged).

gaceladri commented on Mar 31, 2023

Hello, I can not load gpt4all after converting it to the new ggml format using your script: python3 convert-gpt4all-to-ggml.py models/gpt4all/gpt4all-lora-quantized.bin ./models/tokenizer.model. I have opened a new issue probably related to this: #655 (comment).

gaceladri added: I could run it with the previous version, https://github.com/ggerganov/llama.cpp/tree/master-ed3c680.

rabidcopy (Contributor) commented on Mar 31, 2023 (edited)

You need to also run the resulting file through migrate-ggml-2023-03-30-pr613.py: gpt4all weights -> convert-gpt4all-to-ggml.py -> converted gpt4all weights -> migrate-ggml-2023-03-30-pr613.py -> gpt4all weights compatible with the latest version of llama.cpp.

gaceladri replied on Mar 31, 2023

It worked. Thank you for your fast response!

Nuked88 pushed a commit to Nuked88/llama.http referencing this pull request (5c8b15d, "Introduce GGML migration tool for new file format") on Mar 31, 2023, and edwios mentioned it from ggerganov/whisper.cpp#702 (Failed to load llama model, open).

Reviewers: sw, pgoodman, mqy, bakkot, and Green-Sky left review comments; ggerganov approved these changes. Labels: breaking change, performance. 14 participants: jart, luminalle, FNsi, ggerganov, x02Sylvie, rabidcopy, CoderRC, gaceladri, sw, pgoodman, mqy, bakkot, Green-Sky, slaren.