[HN Gopher] LM Studio 0.3 - Discover, download, and run local LLMs
___________________________________________________________________
LM Studio 0.3 - Discover, download, and run local LLMs
Author : fdb
Score : 195 points
Date : 2024-08-22 18:22 UTC (2 days ago)
(HTM) web link (lmstudio.ai)
(TXT) w3m dump (lmstudio.ai)
| navaed01 wrote:
| Congrats! I'm a big fan of the existing product, and there are
| some great updates here to make the app even more accessible
| and powerful.
| webprofusion wrote:
| Cool, though it's a bit weird that the Windows download is
| 32-bit. It should be 64-bit by default; there's no need for a
| 32-bit Windows version at all.
| webprofusion wrote:
| It's probably 64-bit and they just call it x86 on their
| website. It needs an option to choose where models get
| downloaded to, as typically your C: drive is an SSD with
| limited space.
| Jedd wrote:
| It has an option to choose where models get downloaded: in the
| Models tab you can pick the target path.
| diggan wrote:
| > Needs an option to choose where models get downloaded to, as
| typically your C: drive is an SSD with limited space.
|
| You can already do this? https://i.imgur.com/BpF3K9t.png
| pcf wrote:
| In some brief testing, I discovered that the same models (Llama
| 3 8B and one more I can't remember) run MUCH slower in LM
| Studio than in Ollama on my MacBook Air M1 2020.
|
| Has anyone found the same thing, or was that a fluke and I should
| try LM Studio again?
| christkv wrote:
| Make sure you turn on GPU use with the slider. By default it
| does not run at full speed.
| smcleod wrote:
| Don't forget to tune your num_batch.
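|
| (A rough, untested sketch of what that tuning looks like via
| Ollama's local REST API, for anyone comparing the two apps; the
| model name is a placeholder, and num_batch maps to llama.cpp's
| n_batch:)
|
|     import requests
|
|     # Ask Ollama (default port 11434) for a completion with an
|     # explicit batch size. "llama3" is assumed to be pulled
|     # already; num_batch trades memory for prompt-processing
|     # speed.
|     resp = requests.post(
|         "http://localhost:11434/api/generate",
|         json={
|             "model": "llama3",
|             "prompt": "Why is the sky blue?",
|             "stream": False,
|             "options": {"num_batch": 512},
|         },
|     )
|     print(resp.json()["response"])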
| Terretta wrote:
| Two replies to parent immediately suggest tuning. Ironically,
| this release claims to feature auto-config for best
| performance:
|
| _"Some of us are well versed in the nitty gritty of LLM load
| and inference parameters. But many of us, understandably, can
| 't be bothered. LM Studio 0.3.0 auto-configures everything
| based on the hardware you are running it on."_
|
| So parent should expect it to work.
|
| I find the same issue: using an MBP with 96GB (M2 Max with a
| 38-core GPU), it seems to tune by default for a base machine.
| viccis wrote:
| Just chiming in with others to help out:
|
| By default LM Studio doesn't fully use your GPU. I have no idea
| why. Under the settings pane on the right, turn the slider
| under "GPU Offload" all the way to 100%.
| cma wrote:
| Maybe so the web browser etc. still has some GPU without
| swapping from main memory? What % does it default to?
| alok-g wrote:
| See also: Msty.app
|
| It allows both local and cloud models.
|
| * Not associated with them in any way. Am a happy user.
| k2enemy wrote:
| Also jan.ai for offline and online.
| vunderba wrote:
| +1 for Jan, and unlike Msty/LM Studio - it's open source.
| grigio wrote:
| Can somebody share benchmarks on AMD Ryzen AI with and without
| the NPU?
| Jedd wrote:
| It uses llama.cpp, so it's going to show the same numbers as
| almost all other apps (given almost everything uses llama.cpp
| under the hood).
| pornlover wrote:
| LM Studio is great, although I wish recommended prompts were
| part of the data of each LLM. I probably just don't know
| enough, but I feel like I get a hunk of magic data and then I'm
| mostly on my own.
|
| Similarly with images: LLMs and ML in general feel like the DOS
| days of config.sys, autoexec.bat, and QEMM.
| Tepix wrote:
| Neat! Can I use it with Brave browser's local LLM feature?
| qwertox wrote:
| Yesterday I wanted to find a snippet from a ChatGPT
| conversation I had maybe one or two weeks ago. Searching for a
| single keyword would have been enough to find it.
|
| How is it possible that there's still no way to search through
| your conversations?
| Jedd wrote:
| Are you complaining about OpenAI's ChatGPT web UI?
| code51 wrote:
| For Mac and iOS, you can install the ChatGPT app.
|
| Why they won't enable search for their main web user crowd is
| beyond me.
|
| Perhaps they are just afraid of scale. Even with all their
| might, it's possible that they can't estimate the scale and
| complexity of the queries they might receive.
| nilsherzig wrote:
| They did staged rollouts for almost every recent feature.
|
| I think it might be in their interest for you to just ask the
| LLM again. Old answers might not be up to their current
| standards, and they don't gain feedback from you looking at old
| answers.
| BaculumMeumEst wrote:
| There are lots of ways to search through your conversations,
| just not through OpenAI's web interface. If you don't want to
| explore alternatives because you don't want to lose access to
| your conversations, I would argue you've just demonstrated to
| yourself why you should proactively avoid vendor lock-in.
| potatoman22 wrote:
| Try exporting your data and searching the JSON/HTML.
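|
| (A quick untested sketch of that in Python; it assumes the
| export zip's conversations.json layout, where message text
| lives under mapping -> message -> content -> parts:)
|
|     import json
|
|     # Search a ChatGPT data export for a keyword and print the
|     # titles of matching conversations.
|     with open("conversations.json", encoding="utf-8") as f:
|         conversations = json.load(f)
|
|     keyword = "gopher"
|     for conv in conversations:
|         for node in conv.get("mapping", {}).values():
|             message = node.get("message") or {}
|             parts = (message.get("content") or {}).get("parts") or []
|             text = " ".join(p for p in parts if isinstance(p, str))
|             if keyword.lower() in text.lower():
|                 print(conv.get("title") or "(untitled)")
|                 break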
| smcleod wrote:
| Nice, it's a solid product! It's just a shame it's not open
| source and its license doesn't permit work use.
| yags wrote:
| Thanks! We actually totally permit work use. See
| https://lmstudio.ai/enterprise.html
| jdboyd wrote:
| An "email us" link is a bit of a discouragement from using it
| for work purposes. I want a clearly defined price list, at
| least for some entry levels of commercial use.
| fragmede wrote:
| Or even just a ballpark. Are we talking $500, $5,000,
| $50,000 or $500,000?
| smcleod wrote:
| Thanks, what license is it under? This means that anyone who
| wants to try it at work has to fill that out though, right?
| TeMPOraL wrote:
| Does anyone know if there's a changelog/release notes available
| for _all_ historical versions of this? This is one of those
| programs with the annoying habit of surfacing only the list of
| changes in the most recent version, and their release cadence
| is such that there are some 3 to 5 updates between the times I
| run it, and then I have no idea what changed.
| flear wrote:
| Same. I found their Discord announcement Channel [1] and they
| may have started to use their blog for a full version changelog
| [2]
|
| [1] https://discord.gg/aPQfnNkxGC [2] https://lmstudio.ai/blog
| swalsh wrote:
| I LOVE LM Studio. It's super convenient for testing model
| capabilities, and the OpenAI-compatible server makes it really
| easy to spin up an endpoint and test. My typical process is to
| load a model in LM Studio, test it, and when I'm happy with the
| settings, move to vLLM.
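|
| (For anyone new to that flow, a minimal sketch: the server
| speaks the OpenAI API on LM Studio's default port 1234, so the
| stock openai client works against it; the model name below is a
| placeholder for whatever you have loaded:)
|
|     from openai import OpenAI
|
|     # Point the standard OpenAI client at LM Studio's local
|     # server. The api_key is unused locally but required by
|     # the client.
|     client = OpenAI(base_url="http://localhost:1234/v1",
|                     api_key="lm-studio")
|
|     response = client.chat.completions.create(
|         model="local-model",  # placeholder for the loaded model
|         messages=[{"role": "user",
|                    "content": "Say hello in five words."}],
|     )
|     print(response.choices[0].message.content)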
| xeromal wrote:
| I never could get anything local working a few years ago, and
| someone on Reddit told me about LM Studio, and I finally
| managed to "run an AI" on my machine. Really cool, and now I'm
| tinkering with it using the built-in HTTP server.
| yags wrote:
| Hello Hacker News, Yagil here, founder and original creator of
| LM Studio (now built by a team of 6!). I had the initial idea
| to build LM Studio after seeing the OG LLaMA weights 'leak'
| (https://github.com/meta-llama/llama/pull/73/files) and then
| later trying to run some TheBloke quants during the heady early
| days of ggerganov/llama.cpp. In my notes LM Studio was first
| "Napster for LLMs", which later evolved into "GarageBand for
| LLMs".
|
| What LM Studio is today is an IDE / explorer for local LLMs,
| with a focus on format universality (e.g. GGUF) and data
| portability (you can go to the file explorer and edit
| everything). The main aim is to give you an accessible way to
| work with LLMs and make them useful for your purposes.
|
| Folks point out that the product is not open source. However, I
| think we facilitate distribution and usage of openly available
| AI and empower many people to partake in it, while protecting
| (in my mind) the business viability of the company. LM Studio
| is free for personal experimentation, and we ask businesses to
| get in touch to buy a business license.
|
| At the end of the day LM Studio is intended to be an easy yet
| powerful tool for doing things with AI without giving up personal
| sovereignty over your data. Our computers are super capable
| machines, and everything that can happen locally w/o the
| internet, should. The app has no telemetry whatsoever (you're
| welcome to monitor network connections yourself) and it can
| operate offline after you download or sideload some models.
|
| 0.3.0 is a huge release for us. We added (naive) RAG,
| internationalization, UI themes, and set up foundations for major
| releases to come. Everything underneath the UI layer is now built
| using our SDK which is open source (Apache 2.0):
| https://github.com/lmstudio-ai/lmstudio.js. Check out specifics
| under packages/.
|
| Cheers!
|
| -Yagil
| fallinditch wrote:
| Does anyone know what advantages LM Studio has over Ollama, and
| vice versa?
| barrkel wrote:
| Ollama doesn't have a UI.
| vunderba wrote:
| A better question would be over something like Jan or
| LibreChat. Ollama is a CLI/API/backend for easily downloading
| and running models.
|
| https://github.com/janhq/jan
|
| https://github.com/danny-avila/LibreChat
|
| Jan is probably the closest thing to an open-source LLM chat
| interface that is relatively easy to get started with.
|
| I personally prefer LibreChat (which supports integration with
| image generation), but it does have to spin up some Docker
| stuff, and that can make it a bit more complicated.
| himhckr wrote:
| There is also Msty (https://msty.app), which I find much easier
| to get started with, and it comes with interesting features
| such as web search, RAG, Delve mode, etc.
| BaculumMeumEst wrote:
| If you're hopping between these products instead of learning
| and understanding how inference works under the hood, and
| familiarizing yourself with the leading open source projects
| (e.g. llama.cpp), you are doing yourself a great disservice.
| m3kw9 wrote:
| Why
| washadjeffmad wrote:
| It's not that high a bar, and we're still very much in the
| publication-to-implementation phase. Most recently, I was able
| to use SAM2, SV3D, Mistral NeMo, and Flux.dev on day one, and
| I'm certainly not some heady software engineer.
|
| There's just a lot of great stuff you're missing out on if
| you're waiting on products while ignoring the very accessible,
| freely available tools they're built on top of, and are often
| reductions of.
|
| I'm not against overlays like Ollama and LM Studio, but I'm
| confused about why they exist when there's no additional
| barrier to going on Hugging Face or using kcpp, ooba, etc.
|
| I just assume it's an awareness issue, but I'm probably
| wrong.
| ganyu wrote:
| While it is perfectly proper and convenient to use these out-
| of-the-box products for the scenarios they fit, doing so will
| at the very least not help us with our interviews. It will also
| restrict our mindset about how one can make use of LLMs,
| through the distraction of sleek, heavily abstracted
| interfaces. This makes it harder, if not impossible, for us to
| come up with bright new ideas that use models in various novel
| ways, ideas which are almost always derived from a deep
| understanding of how things actually work under the hood.
| hnuser123456 wrote:
| I know how training and inference work under the hood; I know
| the activation functions, backprop, and matmul, and I know some
| real applications I really want to build. But there's still
| plenty of room in the gap between those, and LM Studio helps
| fill it. I also already have software built around the OpenAI
| API, and the LM Studio OpenAI API emulator is hard to beat for
| convenience. But if you can outline a process I could follow
| (or link good literature) to shift towards running LLMs locally
| with FOSS but still interact with them through an API, I'll
| absolutely give it a try.
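|
| (One commonly suggested FOSS route, as a rough sketch with
| placeholder paths and ports: llama.cpp's bundled llama-server
| exposes an OpenAI-compatible endpoint, so existing openai-
| client code only needs a new base_url:)
|
|     # Start the server first, e.g.:
|     #   ./llama-server -m ./models/your-model.gguf --port 8080
|     # (model path and port are placeholders).
|     from openai import OpenAI
|
|     client = OpenAI(base_url="http://localhost:8080/v1",
|                     api_key="none")  # key unused locally
|     reply = client.chat.completions.create(
|         model="local",  # llama-server serves the loaded model
|         messages=[{"role": "user", "content": "ping"}],
|     )
|     print(reply.choices[0].message.content)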
| gastonmorixe wrote:
| Have you tried Jan? https://github.com/janhq/jan
| hnuser123456 wrote:
| Fantastic, thank you.
| BaculumMeumEst wrote:
| "hopping between these products instead of learning and
| understanding" was intended to exclude people who already
| know how they work, because I think it is totally fine to use
| them if you know exactly what all the current knobs and
| levers do.
| barrkel wrote:
| Why would someone expect interacting with a local LLM to teach
| anything about inference?
|
| Interacting with a local LLM develops one's intuitions about
| how LLMs work, what they're good for (appropriately scaled to
| model size) and how they break, and gives you ideas about how
| to use them as a tool in bigger applications without getting
| bogged down in API billing, etc.
| BaculumMeumEst wrote:
| Assuming s/would/wouldn't: If you are super smart then
| perhaps you can intuit details about how they work under the
| hood. Otherwise you are working with a mental model that is
| likely to be much more faulty than the one you would develop
| by learning through study.
| barrkel wrote:
| Knowing the specific multiplies and QKV and how attention
| works doesn't develop your intuition for how LLMs work.
| Knowing that the effective output is a list of tokens with
| associated probabilities is of marginal use. Knowing about
| rotary position embeddings, temperature, batching, beam
| search, different techniques for preventing repetition, and
| so on doesn't really develop intuition about behavior, but
| rather improves the worst cases (babbling, repeating nonsense
| in the absolute worst), and you wouldn't know that at all
| from first principles without playing with the things.
|
| The truth is that the inference implementation is more like
| a VM, and the interesting thing is the model, the set of
| learned weights. It's like a program being executed one
| token at a time. How that program behaves is the
| interesting thing. How it degrades. What circumstances it
| behaves really well in, and its failure modes. That's the
| thing where you want to be able to switch and swap a dozen
| models around and get a feel for things, have forking
| conversations, etc. It's what LM Studio is decent at.
| 2browser wrote:
| Running this on Windows on an AMD card. Llama 3.1 8B Instruct
| runs really well on this if anyone wants to try.
| mythz wrote:
| I originally started out with LM Studio, which was pretty nice,
| but ended up switching to Ollama, since I only want to use one
| app to manage all the large model downloads and there are many
| more tools and plugins that integrate with Ollama, e.g. in IDEs
| and text editors.
| a1o wrote:
| What are the recommended system settings for this?
| gymbeaux wrote:
| It depends on the model you run, but generally speaking you
| want an NVIDIA GPU of some substance. I'd say a 3060 at
| minimum.
|
| CPU inference is incredibly slow versus my RTX 3090, but
| technically it will work.
| mark_l_watson wrote:
| Question for everyone: I am using the MLX version of Flux to
| generate really good images from text on my M2 Mac, but I don't
| have an easy setup for doing text + base image to a new image. I
| want to be able to use base images of my family and put them on
| Mount Everest, etc.
|
| Does anyone have a recommendation?
|
| For context: I have almost ten years of experience with deep
| learning, but I want something easy to set up on my home M2
| Mac; Google Colab would also be OK.
| MacsHeadroom wrote:
| Try Diffusion Bee's latest release
| https://github.com/divamgupta/diffusionbee-stable-diffusion-...
| dgreensp wrote:
| I filed a GitHub issue two weeks ago about a bug that was
| enough for me to put it down for a bit, and there hasn't even
| been a response. Their development velocity seems incredible,
| though. I'm not sure what to make of it.
| yags wrote:
| We probably just missed it. Can you please ping me on it?
| "@yagil" on GitHub
___________________________________________________________________
(page generated 2024-08-24 23:01 UTC)