[HN Gopher] LM Studio 0.3 - Discover, download, and run local LLMs
       ___________________________________________________________________
        
       LM Studio 0.3 - Discover, download, and run local LLMs
        
       Author : fdb
       Score  : 195 points
       Date   : 2024-08-22 18:22 UTC (2 days ago)
        
 (HTM) web link (lmstudio.ai)
        
       | navaed01 wrote:
        | Congrats! I'm a big fan of the existing product, and these are
        | some great updates that make the app even more accessible and
        | powerful.
        
       | webprofusion wrote:
        | Cool, but it's a bit weird that the Windows download is 32-bit.
        | It should be 64-bit by default; there's no need for a 32-bit
        | Windows version at all.
        
         | webprofusion wrote:
          | It's probably 64-bit and they just call it x86 on their
          | website. It needs an option to choose where models get
          | downloaded to, as your C: drive is typically an SSD with
          | limited space.
        
           | Jedd wrote:
            | It has an option to choose where models get downloaded - in
            | the Models tab you can pick the target path.
        
           | diggan wrote:
            | > It needs an option to choose where models get downloaded
            | to, as your C: drive is typically an SSD with limited space.
           | 
           | You can already do this? https://i.imgur.com/BpF3K9t.png
        
       | pcf wrote:
        | In some brief testing, I discovered that the same models (Llama 3
        | 8B and one more I can't remember) run MUCH slower in LM Studio
        | than in Ollama on my MacBook Air M1 2020.
       | 
       | Has anyone found the same thing, or was that a fluke and I should
       | try LM Studio again?
        
         | christkv wrote:
          | Make sure you turn up GPU usage with the slider. By default it
          | does not leverage the GPU fully, so you won't get full speed.
        
         | smcleod wrote:
         | Don't forget to tune your num_batch
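          | 
          | A minimal sketch, assuming the ollama Python client (num_batch
          | maps to llama.cpp's n_batch, the prompt-processing batch size;
          | the model name and value below are placeholders):
          | 
          |     # pip install ollama
          |     import ollama
          | 
          |     response = ollama.chat(
          |         model="llama3",  # placeholder model name
          |         messages=[{"role": "user", "content": "Hello!"}],
          |         options={"num_batch": 512},  # tune for your hardware
          |     )
          |     print(response["message"]["content"])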
        
         | Terretta wrote:
         | Two replies to parent immediately suggest tuning. Ironically,
         | this release claims to feature auto-config for best
         | performance:
         | 
         |  _"Some of us are well versed in the nitty gritty of LLM load
         | and inference parameters. But many of us, understandably, can
         | 't be bothered. LM Studio 0.3.0 auto-configures everything
         | based on the hardware you are running it on."_
         | 
         | So parent should expect it to work.
         | 
          | I find the same issue: using an MBP with 96GB (M2 Max with a
          | 38-core GPU), it seems to tune by default for a base machine.
        
         | viccis wrote:
         | Just chiming in with others to help out:
         | 
         | By default LM Studio doesn't fully use your GPU. I have no idea
         | why. Under the settings pane on the right, turn the slider
         | under "GPU Offload" all the way to 100%.
        
           | cma wrote:
           | Maybe so the web browser etc. still has some GPU without
           | swapping from main memory? What % does it default to?
        
       | alok-g wrote:
       | See also: Msty.app
       | 
       | It allows both local and cloud models.
       | 
       | * Not associated with them in any way. Am a happy user.
        
         | k2enemy wrote:
         | Also jan.ai for offline and online.
        
           | vunderba wrote:
            | +1 for Jan; unlike Msty/LM Studio, it's open source.
        
       | grigio wrote:
        | Can somebody share benchmarks on AMD Ryzen AI with and without
        | the NPU?
        
         | Jedd wrote:
          | It's using llama.cpp, so benchmarks will be the same as for
          | almost all other apps (given that almost everything uses
          | llama.cpp under the hood).
        
       | pornlover wrote:
        | LM Studio is great, although I wish recommended prompts were part
        | of the data of each LLM. I probably just don't know enough, but I
        | feel like I get a hunk of magic data and then I'm mostly on my
        | own.
        | 
        | Similarly with images: LLMs and ML in general feel like the DOS
        | days of config.sys, autoexec.bat, and QEMM.
        
       | Tepix wrote:
        | Neat! Can I use it with Brave browser's local LLM feature?
        
       | qwertox wrote:
        | Yesterday I wanted to find a snippet from a ChatGPT conversation
        | I had maybe 1 or 2 weeks ago. Searching for a single keyword
        | would have been enough to find it.
       | 
       | How is it possible that there's still no way to search through
       | your conversations?
        
         | Jedd wrote:
          | Are you complaining about OpenAI's ChatGPT web UI?
        
         | code51 wrote:
          | For Mac and iOS, you can install the ChatGPT app.
          | 
          | Why they won't enable search for their main web user crowd is
          | beyond me.
          | 
          | Perhaps they are just afraid of scale. Even with all their
          | might, it's still possible that they can't estimate the scale
          | and complexity of the queries they might receive.
        
           | nilsherzig wrote:
           | They did staged rollouts for almost every recent feature.
           | 
            | I think it might be in their interest that you just ask the
            | LLM again: old answers might not be up to their current
            | standards, and they don't gain feedback from you looking at
            | old answers.
        
         | BaculumMeumEst wrote:
          | There are lots of ways to search through your conversations,
          | just not through OpenAI's web interface. If you don't want to
          | explore alternatives because you don't want to lose access to
          | your conversations, I would argue you've just demonstrated to
          | yourself why you should proactively avoid vendor lock-in.
        
         | potatoman22 wrote:
         | Try exporting your data and searching the JSON/HTML.
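          | 
          | A minimal sketch of the search step, assuming the export's
          | conversations.json (the exact schema may vary): it walks every
          | string in each conversation and prints the titles of those
          | containing the keyword.
          | 
          |     import json
          | 
          |     def contains(node, needle):
          |         # Recursively search all strings in a JSON structure.
          |         if isinstance(node, str):
          |             return needle.lower() in node.lower()
          |         if isinstance(node, dict):
          |             return any(contains(v, needle) for v in node.values())
          |         if isinstance(node, list):
          |             return any(contains(v, needle) for v in node)
          |         return False
          | 
          |     with open("conversations.json") as f:
          |         for convo in json.load(f):
          |             if contains(convo, "keyword"):
          |                 print(convo.get("title") or "(untitled)")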
        
       | smcleod wrote:
       | Nice, it's a solid product! It's just a shame it's not open
       | source and its license doesn't permit work use.
        
         | yags wrote:
         | Thanks! We actually totally permit work use. See
         | https://lmstudio.ai/enterprise.html
        
           | jdboyd wrote:
            | An "email us" link is a bit of a discouragement from using
            | it for work purposes. I want a clearly defined price list,
            | at least for some entry levels of commercial use.
        
             | fragmede wrote:
             | Or even just a ballpark. Are we talking $500, $5,000,
             | $50,000 or $500,000?
        
           | smcleod wrote:
            | Thanks - what license is it under? This means that anyone
            | who wants to try it at work has to fill that out though,
            | right?
        
       | TeMPOraL wrote:
        | Does anyone know if there's a changelog/release notes available
        | for _all_ historical versions of this? This is one of those
        | programs with the annoying habit of surfacing only the list of
        | changes in the most recent version, and their release cadence is
        | such that there are some 3 to 5 updates between the times I run
        | it, and then I have no idea what changed.
        
         | flear wrote:
          | Same. I found their Discord announcements channel [1], and they
          | may have started to use their blog for a full version changelog
          | [2].
          | 
          | [1] https://discord.gg/aPQfnNkxGC
          | 
          | [2] https://lmstudio.ai/blog
        
       | swalsh wrote:
        | I LOVE LM Studio. It's super convenient for testing model
        | capabilities, and the OpenAI-compatible server makes it really
        | easy to spin up a server and test. My typical process is to load
        | a model in LM Studio, test it, and when I'm happy with the
        | settings, move to vLLM.
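        | 
        | A minimal sketch of that loop, assuming LM Studio's local server
        | is running on its default port (1234). It exposes an OpenAI-
        | compatible API, so the standard openai client works as-is:
        | 
        |     # pip install openai
        |     from openai import OpenAI
        | 
        |     # The API key is unused by LM Studio but required by the
        |     # client library.
        |     client = OpenAI(base_url="http://localhost:1234/v1",
        |                     api_key="lm-studio")
        | 
        |     resp = client.chat.completions.create(
        |         model="local-model",  # placeholder; the loaded model responds
        |         messages=[{"role": "user", "content": "Say hi in 5 words."}],
        |     )
        |     print(resp.choices[0].message.content)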
        
       | xeromal wrote:
        | I never could get anything local working a few years ago, and
        | someone on Reddit told me about LM Studio, and I finally managed
        | to "run an AI" on my machine. Really cool, and now I'm tinkering
        | with it using the built-in HTTP server.
        
       | yags wrote:
        | Hello Hacker News, Yagil here - founder and original creator of
        | LM Studio (now built by a team of 6!). I had the initial idea to
       | build LM Studio after seeing the OG LLaMa weights 'leak'
       | (https://github.com/meta-llama/llama/pull/73/files) and then
       | later trying to run some TheBloke quants during the heady early
       | days of ggerganov/llama.cpp. In my notes LM Studio was first
       | "Napster for LLMs" which evolved later to "GarageBand for LLMs".
       | 
        | What LM Studio is today is an IDE / explorer for local LLMs,
       | with a focus on format universality (e.g. GGUF) and data
       | portability (you can go to file explorer and edit everything).
       | The main aim is to give you an accessible way to work with LLMs
       | and make them useful for your purposes.
       | 
        | Folks point out that the product is not open source. However, I
        | think we facilitate distribution and usage of openly available AI
       | and empower many people to partake in it, while protecting (in my
       | mind) the business viability of the company. LM Studio is free
       | for personal experimentation and we ask businesses to get in
       | touch to buy a business license.
       | 
       | At the end of the day LM Studio is intended to be an easy yet
       | powerful tool for doing things with AI without giving up personal
       | sovereignty over your data. Our computers are super capable
       | machines, and everything that can happen locally w/o the
       | internet, should. The app has no telemetry whatsoever (you're
       | welcome to monitor network connections yourself) and it can
       | operate offline after you download or sideload some models.
       | 
       | 0.3.0 is a huge release for us. We added (naive) RAG,
       | internationalization, UI themes, and set up foundations for major
       | releases to come. Everything underneath the UI layer is now built
       | using our SDK which is open source (Apache 2.0):
       | https://github.com/lmstudio-ai/lmstudio.js. Check out specifics
       | under packages/.
       | 
       | Cheers!
       | 
       | -Yagil
        
       | fallinditch wrote:
        | Does anyone know what advantages LM Studio has over Ollama, and
        | vice versa?
        
         | barrkel wrote:
         | Ollama doesn't have a UI.
        
         | vunderba wrote:
          | A better question would be versus something like Jan or
          | LibreChat. Ollama is a CLI/API/backend for easily downloading
          | and running models.
          | 
          | https://github.com/janhq/jan
          | 
          | https://github.com/danny-avila/LibreChat
          | 
          | Jan's probably the closest thing to an open-source LLM chat
          | interface that is relatively easy to get started with.
          | 
          | I personally prefer LibreChat (which supports integration with
          | image generation), but it does have to spin up some Docker
          | stuff, and that can make it a bit more complicated.
        
           | himhckr wrote:
            | There is also Msty (https://msty.app), which I find much
            | easier to get started with, and it comes with interesting
            | features such as web search, RAG, Delve mode, etc.
        
       | BaculumMeumEst wrote:
        | If you're hopping between these products instead of learning and
        | understanding how inference works under the hood, and
        | familiarizing yourself with the leading open source projects
        | (e.g., llama.cpp), you are doing yourself a great disservice.
        
         | m3kw9 wrote:
         | Why
        
           | washadjeffmad wrote:
            | It's not that high a bar, and we're still very much in a
            | publication-to-implementation era. Most recently, I was able
            | to use SAM2, SV3D, Mistral NeMo, and Flux.dev day one, and
            | I'm certainly not some heady software engineer.
            | 
            | There's just a lot of great stuff you're missing out on if
            | you're waiting on products while ignoring the very
            | accessible, freely available tools they're built on top of
            | and are often reductions of.
            | 
            | I'm not against overlays like Ollama and LM Studio, but I'm
            | confused about why they exist when there's no additional
            | barrier to going on Hugging Face or using kcpp, ooba, etc.
            | 
            | I just assume it's an awareness issue, but I'm probably
            | wrong.
        
           | ganyu wrote:
            | While these out-of-the-box products are the most proper and
            | convenient choice in the scenarios they fit, using them
            | will, at the very least, not help us with our interviews.
            | It will also restrict our mindset of how one can make use
            | of LLMs, through the distraction of sleek, heavily
            | abstracted interfaces. This makes it harder, if not
            | impossible, for us to come up with bright new ideas that
            | use models in various novel ways - ideas that are almost
            | always derived from a deep understanding of how things
            | actually work under the hood.
        
         | hnuser123456 wrote:
          | I know how training and inference work under the hood - I know
          | the activation functions, backprop, and matmuls - and I know
          | some real applications I really want to build. But there's
          | still plenty of room in the gap between those, and LM Studio
          | helps fill it. I also already have software built around the
          | OpenAI API, and LM Studio's OpenAI API emulator is hard to
          | beat for convenience. But if you can outline a process I could
          | follow (or link good literature) to shift towards running LLMs
          | locally with FOSS but still interact with them through an API,
          | I'll absolutely give it a try.
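          | 
          | One FOSS route, as a sketch: llama.cpp ships a server binary
          | with an OpenAI-compatible endpoint, so OpenAI-based code only
          | needs a base_url change. The model path, port, and model name
          | below are placeholders.
          | 
          |     # First launch the server (from a llama.cpp build):
          |     #   llama-server -m ./model.gguf --port 8080
          |     from openai import OpenAI
          | 
          |     client = OpenAI(base_url="http://localhost:8080/v1",
          |                     api_key="none")  # no key needed locally
          |     resp = client.chat.completions.create(
          |         model="local",  # placeholder; the loaded model responds
          |         messages=[{"role": "user", "content": "Hello"}],
          |     )
          |     print(resp.choices[0].message.content)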
        
           | gastonmorixe wrote:
           | Have you tried Jan? https://github.com/janhq/jan
        
             | hnuser123456 wrote:
             | Fantastic, thank you.
        
           | BaculumMeumEst wrote:
           | "hopping between these products instead of learning and
           | understanding" was intended to exclude people who already
           | know how they work, because I think it is totally fine to use
           | them if you know exactly what all the current knobs and
           | levers do.
        
         | barrkel wrote:
         | Why would someone expect interacting with a local LLM to teach
         | anything about inference?
         | 
          | Interacting with a local LLM develops one's intuitions about
          | how LLMs work, what they're good for (appropriately scaled to
          | model size) and how they break, and it gives you ideas about
          | how to use them as a tool in bigger applications without
          | getting bogged down in API billing, etc.
        
           | BaculumMeumEst wrote:
           | Assuming s/would/wouldn't: If you are super smart then
           | perhaps you can intuit details about how they work under the
           | hood. Otherwise you are working with a mental model that is
           | likely to be much more faulty than the one you would develop
           | by learning through study.
        
             | barrkel wrote:
              | Knowing the specific multiplies and QKV and how attention
              | works doesn't develop your intuition for how LLMs work.
              | Knowing that the effective output is a list of tokens with
              | associated probabilities is of marginal use. Knowing about
              | rotary position embeddings, temperature, batching, beam
              | search, different techniques for preventing repetition,
              | and so on doesn't really develop intuition about behavior
              | either; those techniques mostly improve the worst cases
              | (babbling, repeated nonsense in the absolute worst), and
              | you wouldn't know that at all from first principles
              | without playing with the things.
             | 
             | The truth is that the inference implementation is more like
             | a VM, and the interesting thing is the model, the set of
             | learned weights. It's like a program being executed one
             | token at a time. How that program behaves is the
             | interesting thing. How it degrades. What circumstances it
             | behaves really well in, and its failure modes. That's the
             | thing where you want to be able to switch and swap a dozen
             | models around and get a feel for things, have forking
             | conversations, etc. It's what LM Studio is decent at.
        
       | 2browser wrote:
        | Running this on Windows on an AMD card. Llama 3.1 8B Instruct
        | runs really well on this if anyone wants to try.
        
       | mythz wrote:
        | I originally started out with LM Studio, which was pretty nice,
        | but I ended up switching to Ollama since I only want to use one
        | app to manage all the large model downloads, and there are many
        | more tools and plugins that integrate with Ollama, e.g. in IDEs
        | and text editors.
        
       | a1o wrote:
        | What are the recommended system settings for this?
        
         | gymbeaux wrote:
          | It depends on the model you run, but generally speaking you
          | want an NVIDIA GPU of some substance - I'd say a 3060 at
          | minimum.
         | 
         | CPU inference is incredibly slow versus my RTX 3090, but
         | technically it will work.
        
       | mark_l_watson wrote:
       | Question for everyone: I am using the MLX version of Flux to
       | generate really good images from text on my M2 Mac, but I don't
       | have an easy setup for doing text + base image to a new image. I
       | want to be able to use base images of my family and put them on
       | Mount Everest, etc.
       | 
       | Does anyone have a recommendation?
       | 
       | For context: I have almost ten years experience with deep
       | learning, but I want something easy to set up in my home M2 Mac,
       | or Google Colab would be OK.
        
         | MacsHeadroom wrote:
         | Try Diffusion Bee's latest release
         | https://github.com/divamgupta/diffusionbee-stable-diffusion-...
        
       | dgreensp wrote:
        | I filed a GitHub issue two weeks ago about a bug that was enough
        | for me to put it down for a bit, and there hasn't even been a
        | response. Their development velocity seems incredible, though.
        | I'm not sure what to make of it.
        
         | yags wrote:
         | We probably just missed it. Can you please ping me on it?
         | "@yagil" on GitHub
        
       ___________________________________________________________________
       (page generated 2024-08-24 23:01 UTC)