[HN Gopher] OpenJourney: Midjourney, but Open Source
       ___________________________________________________________________
        
       OpenJourney: Midjourney, but Open Source
        
       Author : walterbell
       Score  : 585 points
       Date   : 2023-01-25 18:36 UTC (1 day ago)
        
 (HTM) web link (open-journey.github.io)
 (TXT) w3m dump (open-journey.github.io)
        
       | Simon321 wrote:
        | Keep in mind the real Midjourney uses a completely different
        | architecture; this is just a checkpoint for Stable Diffusion.
        
         | vintermann wrote:
         | Who knows what Midjourney uses. We've got only claims in
         | discords to go by.
         | 
         | My guess is they do internally a slightly more careful and less
         | porn/anime oriented version of what the 4chan/protogen people
         | do. Make lots of fine tuned checkpoints, merge them, fine tune
         | on a selection of outputs from that, merge more, throw away
         | most of it, try again etc. Maybe there are other models in the
         | mix, but I wouldn't bet on it.
        
       | rks404 wrote:
        | noob question - how hard is it to set up and run this on a
        | Windows machine? I've had bad luck with Python and package
        | management on Windows in the past, but that was a long time
        | ago.
        
         | jpe90 wrote:
         | If you use the webui it's a single git clone and an optional
         | file edit to set some CLI flags and that's it. You download
          | models and move them to a directory to use them. Recently
          | they introduced a binary release for people who are
          | unfamiliar with git.
        
         | andybak wrote:
         | Yeah - it's a real pain (and I'm a Python dev)
         | 
         | I just use
         | https://softology.pro/tutorials/tensorflow/tensorflow.htm
         | 
          | - A few manual steps but mainly a well-tested installer that
          | does it all for you.
        
           | rks404 wrote:
            | thank you, I appreciate the honesty! I checked out the
            | guide; it looks promising and I'll give it a try on the
            | next system I assemble
        
         | rm999 wrote:
          | It's gotten much easier in the last 24 hours because of this
          | binary release of a popular stable diffusion setup+UI:
         | https://github.com/AUTOMATIC1111/stable-diffusion-webui/rele...
         | 
         | (you still need a Nvidia GPU)
         | 
          | Extract the zip file and run the batch file. Find the ckpt
          | (checkpoint) file for a model you want. You can find
         | openjourney here:
         | https://huggingface.co/openjourney/openjourney/tree/main. Add
         | it to the model directory.
         | 
         | Then you just need to go to a web browser and you can use the
         | AUTOMATIC1111 webui. More information here:
         | https://github.com/AUTOMATIC1111/stable-diffusion-webui
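          | 
          | For reference, you can also skip the webui and load the same
          | checkpoint through the diffusers library. A rough, untested
          | sketch (model id taken from the Hugging Face link above; the
          | "mdjrny-v4 style" trigger phrase is what this fine-tune
          | reportedly responds to):
          | 
          |   import torch
          |   from diffusers import StableDiffusionPipeline
          | 
          |   # pull the weights from the HF hub, build a txt2img pipeline
          |   pipe = StableDiffusionPipeline.from_pretrained(
          |       "openjourney/openjourney", torch_dtype=torch.float16
          |   )
          |   pipe = pipe.to("cuda")  # needs an Nvidia GPU, as noted above
          | 
          |   prompt = "mdjrny-v4 style, a cozy cabin in a snowy forest"
          |   image = pipe(prompt).images[0]
          |   image.save("cabin.png")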
        
       | haghiri wrote:
       | This was my project, but since @prompthero changed their
       | "midjourney-v4 dreambooth" model's name to openjourney, I changed
       | my model name to "Mann-E" which is accessible here:
       | https://huggingface.co/mann-e/mann-e_4_rev-0-1 (It's only a
       | checkpoint and under development)
        
       | celestialcheese wrote:
       | If anyone wants to try it out without having to build and install
       | the thing - https://replicate.com/prompthero/openjourney
       | 
       | I've been using openjourney (and MJ/SD) quite a bit, and it does
       | generate "better" with "less" compared to standard v1.5, but it's
       | nowhere close to Midjourney v4.
       | 
        | Midjourney is so far ahead in generating "good" images across
        | a wide space of styles and subjects using very little
        | prompting, while SD requires careful negative prompts and
        | elaborate prompting to generate something decent.
       | 
        | Very interested in being wrong about this; there's so much
        | happening with SD that it's hard to keep up with what's
        | working best.
        
         | MintsJohn wrote:
          | I've been thinking that for months, but recently I swung
          | back towards being more optimistic about SD: everything
          | Midjourney makes looks like Midjourney, while SD lets you
          | create images in any style. MJ really needs to get rid of
          | that MJ style, or make it optional; it's undeniably pretty,
          | it's just becoming a little much.
          | 
          | But I still feel 2.x is somehow a degradation of 1.x; it's
          | hard to get something decent out of it. The custom
          | training/tuning and all is nice (and certainly the top
          | reason to use SD over MJ; there are many use cases MJ just
          | can't do), but it should not be used as a band-aid for
          | apparently inherent shortcomings in the new CLIP layer (I'm
          | assuming this is where the largest difference comes from,
          | since the U-Net is trained on largely the same dataset as
          | 1.x).
        
           | ted_bunny wrote:
           | What's SD? No one's said.
        
             | tehsauce wrote:
             | stable diffusion
        
             | agf wrote:
             | Stable Diffusion
             | https://en.wikipedia.org/wiki/Stable_Diffusion
        
           | lobocinza wrote:
           | MJ can emulate a lot of styles
           | 
           | https://github.com/willwulfken/MidJourney-Styles-and-
           | Keyword...
        
           | lobocinza wrote:
            | MJ is easy to get started with and works well out of the
            | box. SD is for those who want to do things that MJ can't,
            | like embeddings.
        
           | michaelbrave wrote:
            | I think 2.0 still has potential: it works much better with
            | textual inversion type models, which can kind of play nice
            | with each other, so given enough of those I imagine you
            | can get some cool stuff out of it. I've also heard it
            | handles negative prompts much better, so those are less
            | optional in 2.0.
            | 
            | But for now all my custom models are 1.5, so I've yet to
            | fully upgrade; most of the community seems to be doing the
            | same at the moment.
        
           | throwaway675309 wrote:
           | To be fair that's the default style of MJ, you're seeing that
           | a lot because most users don't take the time to add style
           | modifiers to their prompts.
           | 
            | If you add qualifiers such as soft colors,
            | impressionistic, western animation, stencil, etc., you can
            | steer Midjourney towards much more personalized styles.
        
           | brianjking wrote:
           | Yeah, a lot of Midjourney images are very clearly Midjourney
           | images. Does Midjourney have inpainting/outpainting yet? I
           | admit it's the offering I've evaluated the least.
           | 
            | Midjourney's upscaled images at their current max offering
            | look fantastic, that's for sure. My wife generates some
            | really great stuff just for fun.
        
             | lobocinza wrote:
             | It has inpainting and scaffolding at least.
        
           | smeagull wrote:
           | SD really shat the bed, and a bunch of projects appear to
           | have stuck with 1.5.
        
         | chamwislothe2nd wrote:
          | Every Midjourney image has the same feeling to it. A bit
          | 1950s sci-fi artist. I guess it's just that it all looks
          | airbrushed? I can't put my finger on it.
        
           | cwkoss wrote:
           | Yeah, I think Midjourney makes fewer unsuccessful images, but
           | harder to get images that dont match their particular style.
        
             | TillE wrote:
             | I don't know if that was Midjourney's intent, but it seems
             | like a smart approach. Instead of trying to be everything
             | to everyone and generating quite a lot of ugly garbage, you
             | get consistently good-looking stuff in a certain style. I'm
             | sure it helps their business model.
        
               | another-dave wrote:
               | Feels like it's the Instagram model for prompt-generated
               | images.
               | 
               | Anyone can get a camera phone, take a picture and use
               | some free software (e.g. gimp) to get great results in
               | post-processing.
               | 
               | Most non-expert users though want to click on a few pre-
               | defined filters, find one they like & run with it, rather
               | than having more control yet poorer results (precisely
               | because they _aren't_ experts).
        
           | lobocinza wrote:
            | I've played a lot with it lately and that's just not true.
            | If you play with styles, colors, angles, and views, you
            | have a lot of control over how the image will look. It can
            | emulate pretty much all mainstream aesthetics.
        
           | IshKebab wrote:
           | It's the science magazine article illustration look.
        
             | brianl047 wrote:
             | Sounds great
             | 
             | If Midjourney applies this to all their artwork then maybe
             | it alleviates some of the ethical concerns (Midjourney then
             | has a "style" independent of the training data)
        
               | VulgarExigency wrote:
               | But the style isn't independent of training data. If you
               | don't feed Midjourney images in that style, it's not
               | going to come up with it independently.
        
         | ImprobableTruth wrote:
         | I think it's down to having a lot of feedback data due to being
         | a service, SD has its aesthetics ratings, but I assume it pales
         | in comparison.
        
       | whitten wrote:
        | Maybe this is an obvious question, but if you generate
        | pictures using any of these tools, can you create the same
        | picture/character/person with different poses or backgrounds,
        | such as for telling a story or creating a comic book? Or would
        | you get a new picture every time, such as for the cover of a
        | magazine?
        | 
        | How reproducible would the pictures be?
        
         | Narciss wrote:
         | Yes, you can create an AI model based on a few pictures of the
         | "model" (the model can also be AI generated) and then you can
         | generate images of all kinds with that model included.
         | 
         | Check out this video from prompt muse as an example:
         | https://youtu.be/XjObqq6we4U
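          | 
          | Mechanically, Dreambooth-style tools bind your subject to a
          | rare placeholder token during fine-tuning; afterwards you
          | just reuse that token in new prompts and vary the scene and
          | seed. A hypothetical sketch (the model path and "sks" token
          | are made up for illustration):
          | 
          |   from diffusers import StableDiffusionPipeline
          | 
          |   # load the model fine-tuned on a few photos of the subject
          |   pipe = StableDiffusionPipeline.from_pretrained(
          |       "./my-dreambooth-model"  # hypothetical local path
          |   ).to("cuda")
          | 
          |   # same character, new scene; pose varies with the seed
          |   prompt = "sks person reading a map in a desert, comic style"
          |   image = pipe(prompt).images[0]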
        
       | sophrocyne wrote:
       | Hey all - InvokeAI maintainer here. A few folks mentioned us in
       | other comments, so posting a few steps to try out this model
       | locally.
       | 
       | Our Repo: https://github.com/invoke-ai/InvokeAI
       | 
        | You will need one of the following: an NVIDIA graphics card
        | with 4 GB or more of VRAM, or an Apple computer with an M1
        | chip.
       | 
       | Installation Instructions: https://invoke-
       | ai.github.io/InvokeAI/installation/
       | 
       | Download the model from Huggingface, add it through our Model
       | Mgmt UI, and then start prompting.
       | 
       | Discord: https://discord.gg/invokeai-the-stable-diffusion-
       | toolkit-102...
       | 
        | Also, a plug: we're actively looking for people who want to
        | contribute to our project! Hope you enjoy using the tool.
        
         | d3ckard wrote:
         | Out of curiosity, will M2s work out of the box?
        
           | sophrocyne wrote:
            | Ought to! There are some enhancements coming down the pipe
            | for Macs with CoreML, so while they won't be as fast as a
            | higher-end NVIDIA card, they'll continue to get
            | performance increases as well.
        
       | moneywoes wrote:
        | Is there a solid comparison of Midjourney, Stable Diffusion,
        | and DALL-E 2?
        
         | moffkalast wrote:
         | I've only tried out stable diffusion to any real extent, but
         | seeing what other people have gotten out of the other two I can
         | easily say it's the least performant of the bunch.
        
           | sdenton4 wrote:
           | I would be hesitant to pass judgement if only playing with
           | one. It's easy to compare the deluge you've picked through to
           | other people's best picked cherries...
        
             | moffkalast wrote:
              | Well sure, but after hours and hours of messing with
              | params, my cherry-picked best cases were still
              | light-years away from the average Midjourney example.
              | Maybe I'm just bad at it ¯\_(ツ)_/¯
        
           | d3ckard wrote:
            | I actually got better examples running SD on my M1 MBA
            | than from my Midjourney trial.
        
       | 88stacks wrote:
        | I was about to integrate this into https://88stacks.com but it
        | requires a write token to Hugging Face, which makes no sense.
        | It's a model that you download. Why does it need write access
        | to Hugging Face!?!
        
         | bootloop wrote:
          | Does it really? Have you tried it, or do you mean because of
          | the documentation? I just skimmed through the code and
          | haven't really seen anything related to uploading. It might
          | not even be required.
        
       | version_five wrote:
       | The huggingface element of these annoys me. Reading the other
       | comments, this is just a stable diffusion checkpoint, so I should
       | be able to download it and not use the diffusers library or
       | whatever other HF stuff. But it's frustrating that it's tied to a
       | for profit ecosystem like this.
       | 
        | I suppose pytorch is / was Facebook, but it feels more arm's
        | length. I don't have to install and run a Facebook CLI to use
        | it (nobody get any ideas).
       | 
        | You don't need a HF CLI, you just need git LFS (a separate
        | extension to git) to pull the files off of HF (unfortunately
        | still requiring an account with them). It would be nice to see
        | truly open mirrors for this stuff that don't have to involve
        | any company.
        
         | stainablesteel wrote:
          | I don't think it's at the point where most individuals can
          | financially support the model training; it's a company doing
          | all this because it requires the consolidated funds of a
          | business.
          | 
          | Give it 10 years and this will change.
        
           | notpushkin wrote:
           | Maybe crowdfunding is an option today?
        
         | rattt wrote:
         | You don't need a HF account to download the checkpoint, can be
         | downloaded straight from the website/browser, direct url:
         | https://huggingface.co/openjourney/openjourney/resolve/main/...
        
           | version_five wrote:
           | Is it possible to download with curl or git lfs (or other
           | "free" command line tool) with no login? I couldn't find a
           | way to do that with the original sd checkpoints.
        
             | rattt wrote:
              | Yes, it works with anything now; they removed the manual
              | accepting of the terms and the auth requirement some
              | months after release.
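              | 
              | For example, with plain Python and no login (the file
              | name here is hypothetical; check the repo's file list):
              | 
              |   from huggingface_hub import hf_hub_download
              | 
              |   # anonymous download works for public repos like this
              |   path = hf_hub_download(
              |       repo_id="openjourney/openjourney",
              |       filename="mdjrny-v4.ckpt",  # hypothetical name
              |   )
              |   print(path)  # local cache path of the checkpoint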
        
         | Rastonbury wrote:
          | You can download the checkpoint right from Hugging Face, and
          | diffusers is a library you can use for free. I'm not sure
          | what the issue is here. That people need an account?
        
       | sourabh03agr wrote:
       | Looks good but this works well only for gamey, sci-fi kind of
       | themes. Any suggestions for prompts which can yield interesting
       | flowcharts to explain technical concepts?
        
       | EamonnMR wrote:
        | If it's using a RAIL license, isn't it not open source?
        
         | nl wrote:
          | Well, Open Source licenses don't make sense for training
          | artifacts, for the same reason Creative Commons licenses are
          | used for "open" written and artistic works rather than Open
          | Source licenses.
        
         | nickvincent wrote:
          | Yeah, that's a fair critique. I think the short answer is
          | that it depends who you ask.
         | 
         | See this FAQ here: https://www.licenses.ai/faq-2
         | 
         | Specifically:
         | 
         | Q: "Are OpenRAILs considered open source licenses according to
         | the Open Source Definition? NO."
         | 
         | A: "THESE ARE NOT OPEN SOURCE LICENSES, based on the definition
         | used by Open Source Initiative, because it has some
         | restrictions on the use of the licensed AI artifact.
         | 
         | That said, we consider OpenRAIL licenses to be "open". OpenRAIL
         | enables reuse, distribution, commercialization, and adaptation
         | as long as the artifact is not being applied for use-cases that
         | have been restricted.
         | 
         | Our main aim is not to evangelize what is open and what is not
         | but rather to focus on the intersection between open and
         | responsible licensing."
         | 
         | FWIW, there's a lot of active discussion in this space, and it
         | could be the case that e.g. communities settle on releasing
         | code under OSI-approved licenses and models/artifacts under
         | lowercase "open" but use-restricted licenses.
        
           | kmeisthax wrote:
           | My biggest critique of OpenRAIL is that it's not entirely
           | clear that AI is copyrightable[0] to begin with. Specifically
           | the model weights are just a mechanical derivation of
           | training set data. Putting aside the "does it infringe[1]"
           | question, there is zero creativity in the training process.
           | All the creativity is either in the source images or the
           | training code. AI companies scrape source images off the
           | Internet without permission, so they cannot use the source
           | images to enforce OpenRAIL. And while they would own the
            | training code, _nobody is releasing training code_[2], so
            | OpenRAIL wouldn't apply there.
           | 
            | So I do not understand how the resulting model weights are
            | a subject of copyright _at all_, given that the US has
            | firmly rejected the concept of "sweat of the brow" as a
           | copyrightability standard. Maybe in the EU you could claim
           | database rights over the training set you collected. But the
           | US refuses to enforce those either.
           | 
           | [0] I'm not talking about "is AI art copyrightable" - my
           | personal argument would be that the user feeding it prompts
           | or specifying inpainting masks is enough human involvement to
           | make it copyrightable.
           | 
           | The Copyright Office's refusal to register AI-generated works
           | has been, so far, purely limited to people trying to claim
           | Midjourney as a coauthor. They are not looking over your work
           | with a fine-toothed comb and rejecting any submissions that
           | have badly-painted hands.
           | 
           | [1] I personally think AI training is fair use, but a court
           | will need to decide that. Furthermore, fair use training
           | would not include fair use for selling access to the AI or
           | its output.
           | 
           | [2] The few bits of training code I can find are all licensed
           | under OSI/FSF approved licenses or using libraries under such
           | licenses.
        
             | taneq wrote:
             | "Mechanical derivation" is doing a lot of heavy lifting
             | here. What qualifies something as "mechanical"? Any
             | algorithm? Or just digital algorithms? Any process entirely
             | governed by the laws of physics?
        
               | kmeisthax wrote:
               | So, in the US, the bedrock of copyrightability is
               | creativity. The opposite would be what SCOTUS derided as
               | the "sweat of the brow" doctrine, where merely "working
               | hard" would give you copyright over the result. No court
               | in the US will actually accept a sweat of the brow
               | argument, of course, because there's Supreme Court
               | precedent against it.
               | 
               | This is why you can't copyright maps[0], and why scans of
               | public domain artwork are automatically public
               | domain[1][2]. Because there's no creativity in them.
               | 
               | The courts do not oppose the use of algorithms or
               | mechanical tools in art. If I draw something in
               | Photoshop, I still own it. Using, say, a blur or contrast
               | filter does not reduce the creativity of the underlying
               | art, because there's still an artist deciding what
               | filters to use, how to control them, et cetera.
               | 
               | That doesn't apply for AI training. The controls that we
               | do have for AI are hyperparameters and training set data.
               | Hyperparameters are not themselves creative inputs; they
               | are selected by trial and error to get the best result.
               | And training set data _can_ be creative, but the specific
               | AI we are talking about was trained purely on scraped
               | images from the Internet, which the creator does not own.
               | So you have a machine that is being fed no creativity,
               | and thus will produce no creativity, so the courts will
               | reject claims to ownership over it.
               | 
               | [0] Trap streets ARE copyrightable, though. This is why
               | you'll find fake streets that don't exist on your maps
               | sometimes.
               | 
               | [1] https://en.wikipedia.org/wiki/Bridgeman_Art_Library_v
               | ._Corel....
               | 
               | [2] Several museums continue to argue the opposite - i.e.
               | that scanning a public domain work creates a new
               | copyright on the scan. They even tried to harass the
               | Wikimedia Foundation over it: https://en.wikipedia.org/wi
               | ki/National_Portrait_Gallery_and_...
        
             | cwkoss wrote:
             | Is the choice of what to train upon not creative? I feel
             | like it can be.
        
               | kmeisthax wrote:
                | _Possibly_, but even if that were the case, it would
                | protect NovelAI, not Stability.
               | 
               | The closest analogue I can think of would be copyrighting
               | a Magic: The Gathering deck. Robert Hovden did that[0],
               | and somehow convinced the Copyright Office to go along
               | with it. As far as I can tell this never actually got
                | court-tested, though. You _can_ get a thin copyright
                | on arrangements of other works you don't own, but a
                | critical wrinkle in that is that an MTG deck is not
                | merely "an arrangement of aesthetically pleasing card
                | art". The cards are picked because of their _gameplay
                | value_, specifically to min-max a particular win
                | condition. They are not arrangements, but strategies.
               | 
               | Here's the thing: there is no copyright in game rules[1].
               | Those are ideas, which you have to patent[2]. And to the
               | extent that an idea and an expression of that idea are
               | inseparable, the idea part makes the whole
               | uncopyrightable. This is known as the merger doctrine. So
               | you can't copyright an MtG deck that would give you de-
               | facto ownership over a particular game strategy.
               | 
                | So, applying that logic back to the training set,
                | you'd only have ownership inasmuch as your training
                | set was selected for a particular artistic result, and
                | not just "reducing the loss function" or "scoring
                | higher on a double-blind image preference test".
               | 
                | As far as I'm aware, there _are_ companies that do
                | creatively select training set inputs, e.g. NovelAI.
                | However, most of the "generalist" AI art generators,
                | such as Stable Diffusion, Craiyon, or DALL-E, were
                | trained on crawled data without much or any tweaking of
               | the inputs[3]. A lot of them have overfit text prompts,
               | because the people training them didn't even filter for
               | duplicate images. You can also specifically fine-tune an
               | existing model to achieve a particular result, which
               | _would_ be a creative process if you could demonstrate
               | that you picked all the images yourself.
               | 
               | But all of that only applies to the training set list
               | itself; the actual training is still noncreative. The
               | creativity has to flow through to the trained model.
               | There's one problem with that, though: if it turns out
               | that AI training for art generators is _not_ fair use,
               | then your copyright over the model dissolves like cotton
               | candy in water. This is because without a fair use
               | argument, the model is just a derivative work of the
               | training set images, and you _do not own_ unlicensed
               | derivative works[4].
               | 
               | [0] https://pluralistic.net/2021/08/14/angels-and-
               | demons/#owning...
               | 
               | [1] Which is also why Cory Doctorow thinks the D&D OGL
               | (either version) is a water sandwich that just takes away
               | your fair use rights.
               | 
                | [2] WotC _actually did_ patent specific parts of MTG,
                | like turning cards to indicate that they've been used
                | up that turn.
               | 
               | [3] I may have posted another comment in this thread
               | claiming that training sets are kept hidden. I had a
               | brain fart, they all pull from LAION and Common Crawl.
               | 
                | [4] This is also why people sell T-shirts with stolen
                | fanart on them. The artists who drew the stolen art
                | own nothing and cannot sue. The original creator of
                | that art _can_ sue, but more often than not they
                | don't.
        
             | nickvincent wrote:
             | This is a great point.
             | 
             | Not a lawyer, but as I understand the most likely way this
             | question will be answered (for practical purposes in the
             | US) is via the ongoing lawsuits against GitHub Copilot and
             | Stable Diffusion and Midjourney.
             | 
              | I personally agree the creativity is in the source
              | images and the training code. But unless it is decided
              | that, for legal purposes, "AI artifacts" (the files
              | containing model weights, embeddings, etc.) are just
              | transformations of training data, and therefore content
              | subject to the same legal standards as content, I see a
              | lot of value in trying to let people license training
              | data, code, and models separately. And if models are
              | just transformations of content, I expect we can adjust
              | the norms around licensing to achieve similar outcomes
              | (i.e., trying to balance open sharing with some degree
              | of creator-defined use restriction).
        
               | nl wrote:
                | The Copilot and DALL-E lawsuits aren't about whether
                | the training weights file can be copyrighted, though
                | (they are about whether people's work can be freely
                | used for training).
               | 
               | This is a different issue where the OP is arguing that
               | the weights file is not eligible for copyright in the US.
               | That's an interesting and separate point which I haven't
               | really seen addressed before.
        
               | topynate wrote:
               | The two issues aren't exactly the same but they do seem
               | intimately connected. When you consider what's involved
               | in generating a weights file, it's a mostly mechanical
               | process. You write a model, gather some data, and then
               | train. Maybe the design of the model is patentable, or
               | the model/training code is copyrightable (actually, I'm
               | pretty sure it is), but the training process itself is
               | just the execution of a program on some data. You can
               | argue that what that program is doing is simply compiling
               | a collection of facts, which means you haven't created a
               | derivative work, but in that case the weights file is a
               | database, by definition, so not copyrightable in the US.
               | Or you can argue that the program is a tool which you're
               | using to create a new copyrightable work. But in that
               | case it's probably a _derivative_ work.
        
             | twoodfin wrote:
             | How would you distinguish "just a mechanical derivation of
             | training set data" from compiled binary software? The
             | latter seems also to be a mechanical derivation from the
             | source code, but inherits the same protections under
             | copyright law.
        
               | kmeisthax wrote:
               | Usually binaries are compiled from _your own_ source
               | code. If I took leaked Windows NT kernel source and
               | compiled it myself, I wouldn 't be able to claim
               | ownership over the binaries.
               | 
               | Likewise if I drew my own art and used it as sample data
               | for a completely trained-from-scractch art generator, I
               | _would_ own the result. The key problem is that, because
               | AI companies are _not_ licensing their data, there isn 't
               | any creativity that they own for them to assert copyright
               | over. Even if AI training itself is fair use, they still
               | own nothing.
        
               | taneq wrote:
               | Do artists not own copyright on artwork which comprises
               | other sources (eg. collage, sampled music)? It'd be hard
               | to claim that eg. Daft Punk doesn't own copyright on
               | their music.
               | 
               | (Whether other artists can claim copyright over some
               | recognisable sample is another question.)
        
               | kmeisthax wrote:
               | This is why there's the "thin copyright" doctrine in the
               | US. It comes up often in music cases, since a lot of pop
               | music is trying to do the same thing. You _can_ take a
               | bunch of uncopyrightable elements, mix them together in a
               | creative way, and get copyright over that. But that 's a
               | very "thin" copyright since the creativity is less.
               | 
                | I don't think thin copyright would apply to AI model
                | weights, since those are trained entirely by an
                | automated process. Hyperparameters are selected
                | primarily for functionality and not creative merit.
                | And the actual model architectures themselves would be
                | the subject of _patents_, not copyright, since they're
                | ideas, not expressions of an idea.
               | 
               | Related note: have we seen someone try to patent-troll AI
               | yet?
        
               | nl wrote:
               | It depends.
               | 
               | The Verve's Richard Ashcroft lost partial copyright and
               | all royalties for "Bitter Sweet Symphony" because a
               | sample from the Rolling Stones wasn't properly cleared:
               | https://en.m.wikipedia.org/wiki/Bitter_Sweet_Symphony
               | 
               | Men at Work lost copyright over their famous "Land Down
               | Under" because it used a tune from "Kookaburra sits in
               | the Old Gum Tree" as an important part of the chorus.
        
             | kaoD wrote:
             | > nobody is releasing training code
             | 
             | Interesting. Why is this happening?
        
           | skybrian wrote:
           | Fair enough. "Source available" would be better than "open
           | source" in this case, to avoid misleading people. (You do
           | want them to read the terms.)
        
             | daveloyall wrote:
             | I'm not familiar with machine learning.
             | 
             | But, I'm familiar with poking around in source code repos!
             | 
             | I found this https://huggingface.co/openjourney/openjourney
             | /blob/main/tex... . It's a giant binary file. A big binary
             | blob.
             | 
              |  _(The format of the blob is Python's "pickle" format: a
              | binary serialization of an in-memory object, used to
              | store an in-memory object and later load it, perhaps on
              | a different machine.)_
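              | 
              | (You can peek inside such a blob with torch, though
              | unpickling executes arbitrary code, so only open files
              | you trust. A sketch, with a hypothetical file name:
              | 
              |   import torch
              | 
              |   # loads the pickled object; for SD checkpoints this
              |   # is a dict whose "state_dict" maps layer names to
              |   # weight tensors
              |   ckpt = torch.load("mdjrny-v4.ckpt", map_location="cpu")
              |   sd = ckpt.get("state_dict", ckpt)
              |   print(len(sd), "tensors")
              |   print(next(iter(sd)))  # prints one layer name
              | 
              | That gets you the weights themselves, though, not how
              | they were made.)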
             | 
             | But, I did not find any source code for generating that
             | file. Am I missing something?
             | 
             | Shouldn't there at least be a list of input images, etc and
             | some script that uses them to train the model?
        
               | kmeisthax wrote:
               | Hahahahaha you sweet summer child. Training code? For an
               | _art generator_?!
               | 
               | Yeah, no. Nobody in the AI community actually provides
               | training code. If you want to train from scratch you'll
               | need to understand what their model architecture is,
               | collect your own dataset, and write your own training
               | loop.
               | 
               | The closest I've come across is code for training an
               | unconditional U-Net; those just take an image and
               | denoise/draw it. CLIP also has its own training code -
               | though everyone just seems to use OpenAI CLIP[0]. You'll
               | need to figure out how to write a Diffusers pipeline that
               | lets you combine CLIP and a U-Net together, and then
               | alter the U-Net training code to feed CLIP vectors into
               | the model, etc. Stable Diffusion also uses a Variational
               | Autoencoder in front of the U-Net to get higher
               | resolution and training performance, which I've yet to
               | figure out how to train.
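                | 
                | Roughly, a single training step for such an
                | unconditional U-Net looks like this (a sketch built
                | from diffusers components; illustrative only):
                | 
                |   import torch
                |   import torch.nn.functional as F
                |   from diffusers import UNet2DModel, DDPMScheduler
                | 
                |   model = UNet2DModel(sample_size=64,
                |                       in_channels=3, out_channels=3)
                |   sched = DDPMScheduler(num_train_timesteps=1000)
                |   opt = torch.optim.AdamW(model.parameters(), lr=1e-4)
                | 
                |   x = torch.randn(4, 3, 64, 64)  # stand-in batch
                |   noise = torch.randn_like(x)
                |   t = torch.randint(0, 1000, (4,))
                |   noisy = sched.add_noise(x, noise, t)  # forward step
                |   pred = model(noisy, t).sample  # predict the noise
                |   loss = F.mse_loss(pred, noise)
                |   loss.backward(); opt.step(); opt.zero_grad()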
               | 
                | The blob you are looking at is the _actual model
                | weights_. For you see, AI is proprietary software's
                | final form: software so proprietary that not even the
                | creators are allowed to see the source code. Because
                | there _is no source code_, just piles and piles of
                | linear algebra, nonlinear activation functions, and
                | calculus.
               | 
               | For the record, I _am_ trying to train-from-scratch an
               | image generator using public domain data sources[1]. It
               | is not going well: after adding more images it seems to
               | have gotten significantly dumber, with or without a from-
               | scratch trained CLIP.
               | 
               | [0] I think Google Imagen is using BERT actually
               | 
               | [1] Specifically, the PD-Art-old-100 category on
               | Wikimedia Commons.
        
               | walterbell wrote:
               | Thanks for educating the masses of machine-unwashed
               | newbies!
        
               | kelipso wrote:
               | Have you looked at LAION-400M? And the OpenCLIP [1]
               | people have replicated CLIP performance using LAION-400M.
               | 
               | [1] https://github.com/mlfoundations/open_clip
        
               | nl wrote:
               | This isn't entirely accurate.
               | 
                | The SD training set is available and the exact
                | settings are described in reasonable detail:
               | 
               | > The model is trained from scratch 550k steps at
               | resolution 256x256 on a subset of LAION-5B filtered for
               | explicit pornographic material, using the LAION-NSFW
               | classifier with punsafe=0.1 and an aesthetic score >=
               | 4.5. Then it is further trained for 850k steps at
               | resolution 512x512 on the same dataset on images with
               | resolution >= 512x512.
               | 
               | LAION-5B is available as a list of urls.
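                | 
                | That filtering step is essentially a metadata query.
                | A hypothetical sketch over one LAION parquet shard
                | (file and column names vary by release):
                | 
                |   import pandas as pd
                | 
                |   meta = pd.read_parquet("laion-shard-0000.parquet")
                |   keep = meta[(meta["punsafe"] < 0.1)
                |               & (meta["aesthetic"] >= 4.5)]
                |   keep["url"].to_csv("train_urls.txt", index=False)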
        
             | JoshTriplett wrote:
             | Yeah, this should not have a headline of "open source".
             | Really disappointing that this isn't actually open, or even
             | particularly close to being open.
        
           | EamonnMR wrote:
           | Seems like 'the lawyers who made the license' and the OSI
           | might be good authorities on what's open source. I'd love to
           | hear a good FSF rant about RAIL though.
        
             | [deleted]
        
           | dmm wrote:
           | Are ML models even eligible for copyright protection? The
           | code certainly but what about the trained weights?
        
             | charcircuit wrote:
             | My thought is that it is a derivative work from the
             | training data. The creativity comes from what you choose to
             | or not to include.
        
       | titaniumtown wrote:
        | Someone should do this but for ChatGPT. Massive undertaking,
        | though.
       | 
       | Edit: https://github.com/LAION-AI/Open-Assistant
        
         | vnjxk wrote:
         | look up "open assistant"
        
           | titaniumtown wrote:
           | oh damn https://github.com/LAION-AI/Open-Assistant
           | 
           | cool stuff, thanks
        
       | shostack wrote:
       | I'm failing to train a model off of this in the Automatic1111
       | webui Dreambooth extension. Training on vanilla 1.5 works fine.
       | It throws a bunch of errors I don't have in front of me on my
       | phone.
       | 
       | Anyone else have similar issues? I loaded it both from a locally
       | downloaded version of the model as well as from inputting in the
       | huggingface path and my token with write (?!?) permissions.
       | 
       | Anyone run into similar issues? Suggestions?
        
       | nagonago wrote:
       | > Also, you can make a carrier! How you may ask? it is easy. In
       | our time, we have a lot of digital asset marketplaces such as NFT
       | marketplaces that you can sell your items and make a carrier.
       | Never underestimate the power open source software provides.
       | 
        | At first I thought this might be a joke site; the poorly
        | written copy reads like a parody.
       | 
       | Also, as others have pointed out, this is basically just yet
       | another Stable Diffusion checkpoint.
        
         | notpushkin wrote:
          | This particular wording sounds like it could be a poor
          | translation from Russian. _Sdelat' karjeru_ (literally: to
         | make a career) means to make a living doing something, or to
         | succeed in doing some job.
        
       | nickthegreek wrote:
        | This is just an SD checkpoint trained on the output of
        | Midjourney. You can load it into A1111 or InvokeAI for easier
        | usage. If you are looking for new checkpoints, check out the
        | Protogen series for some really neat stuff.
        
         | pdntspa wrote:
         | I just gave Protogen a spin and the diversity of outputs it
         | gave me was abysmal. Every seed for the same (relatively open-
         | ended) prompt used the same color scheme, had the same framing,
         | and the same composition. Whereas with SD 1.5/2.1, the subject
         | would be placed differently in-frame, color schemes were far
         | more varied, and results were far more interesting
         | compositionally. (This is with identical settings between the
         | two models and a random seed)
         | 
         | So unless you want cliche-as-fuck fantasy and samey waifu
         | material, classic SD seems to do a much better job.
        
           | vintermann wrote:
            | Yes, Protogen is based on merging checkpoints. The
            | checkpoints it's merged from are also mostly based on
            | merging. Tracing the ancestry back to fine-tuned models is
            | hard, but there's a ton of booru-tagged anime and porn in
            | there.
           | 
            | If there's one style I dislike more than the bland
            | Midjourney style, it's the super-smooth "realistic" child
            | faces on adult bodies that Protogen (and its many
            | descendants) spits out.
        
         | 152334H wrote:
          | HN is just incredibly bad at figuring out which ML projects
          | are worth getting excited about and which aren't.
         | 
         | MJ v4 doesn't even use Stable Diffusion as a base [0]; a fine-
         | tune of the latter will never come close to achieving what they
         | do.
         | 
         | [0] -
         | https://discord.com/channels/729741769192767510/730095596861...
        
           | kossTKR wrote:
            | It doesn't use Stable Diffusion?
            | 
            | I thought everything besides DALL-E was SD under the hood.
        
             | tsurba wrote:
              | MJ's earlier versions were around before SD came out;
              | before DALL-E 2 too, but after DALL-E 1, IIRC. So I
              | assume they have their own custom setup, perhaps based
              | on the DALL-E 1 paper originally (not the weights, as
              | those were never published) and improved from there.
        
               | kossTKR wrote:
               | Interesting i thought stable diffusion was the only other
               | "big player" besides OpenAI because of the expenses in
               | training and extrapolating from papers / new research.
               | 
               | Is Midjourney heavily funded? Because if they can battle
               | SD why aren't we seeing lots of people doing the same,
               | even in the Open Source space?
        
         | quitit wrote:
          | It's actually worse, because Automatic and Invoke will let
          | you chain up GANs to fix faces and the like, and both have
          | trivial installation procedures.
          | 
          | This offering is like going back to August 2022.
        
         | rahimnathwani wrote:
         | Do you mean this one?
         | https://huggingface.co/darkstorm2150/Protogen_Infinity_Offic...
         | 
         | On the same topic, is there some sort of 'awesome list' of
         | finetuned SD models? (something better than just browsing
         | https://huggingface.co/models?other=stable-diffusion)
        
           | liuliu wrote:
           | https://civitai.com/
        
             | narrator wrote:
             | Looking at this site, I would argue that the canonical
             | "hello world" of an image diffusion model is a picture of a
             | pretty woman. The canonical "hello world" for community
             | chatbots that can run on a consumer GPU will undoubtedly be
             | an AI girlfriend.
        
               | alephaleph wrote:
               | Lena all over again
        
             | dr_dshiv wrote:
             | Wow. Is there something like this for text models?
        
             | madeofpalk wrote:
             | why are they all big breasted women?
        
             | nickthegreek wrote:
              | Not sure why this is downvoted. Civitai does in fact
              | list a bunch of fine-tuned models and can be sorted by
              | highest ranked, liked, downloaded, etc. It is a good
              | resource. Many of the models are also available in the
              | .safetensors format, so you don't have to worry about a
              | pickled checkpoint.
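              | 
              | The difference shows up in one line: .safetensors is a
              | plain tensor container, so loading it never executes
              | pickled code. A sketch (file name hypothetical):
              | 
              |   from safetensors.torch import load_file
              | 
              |   # returns a dict of tensors; no arbitrary code runs,
              |   # unlike torch.load() on a pickled .ckpt
              |   weights = load_file("protogen_x3.4.safetensors")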
        
               | lancesells wrote:
               | I didn't downvote but I have to say the images shown on
               | that page are hilariously juvenile. I was a teenager once
               | so I get it but I'm guessing the content is where the
               | downvotes are coming from?
        
               | CyanBird wrote:
               | "The internet is for Porn! The internet is for Porn! So
               | grab your dick and double click! For Porn! Porn! Porn!"
               | 
               | Apologies for the bad taste, but I simply love that song,
               | an absolute classic
               | 
               | https://youtu.be/j6eFNRKEROw
               | 
                | Anyhow, regarding Civitai, you can filter out the NSFW
                | models quite easily.
                | 
                | It ought to be noted that Protogen 5.3, even though it
                | is not an explicit porn model, was trained with
                | explicit models... so it can be... raunchy as well.
        
             | rahimnathwani wrote:
             | Thanks.
             | 
             | BTW I love your app! At my desk I use Automatic1111
             | (because I have a decent GPU), but it's so nice to have a
             | lean back experience on my iPad. Also, even my 6yo son can
             | use it, as he doesn't need to manipulate a mouse.
        
           | nickthegreek wrote:
           | Here are the protogen models
           | https://civitai.com/user/darkstorm2150
        
         | throwaway64643 wrote:
         | > This is just a sd checkpoint trained on output of Midjourney
         | 
          | Which is suboptimal, verging on bad. You don't want to train
          | on output from an AI, because you'll end up with a worse
          | version of whatever that AI is already bad at (hands, feet,
          | and countless other things). This is the AI feedback loop
          | that people have been talking about.
          | 
          | So instead of figuring out what Midjourney has done to get
          | such good results, people just blatantly copied those
          | results and fed them directly into the AI, true to the
          | art-thief stereotype.
        
         | Eduard wrote:
         | I didn't understand a single word you said :D
        
           | lxe wrote:
           | sd checkpoint -- stable diffusion checkpoint. a model weights
           | file that was obtained by tuning the stablediffusion weights
           | file using probably something like dreambooth on some number
           | of midjourney-generated images.
           | 
           | a1111 / invokeai -- stable diffusion UI tools
           | 
           | Protogen series -- popular stablediffusion checkpoints you
           | can download so you can generate content in various styles
        
       | KaoruAoiShiho wrote:
        | How is it equivalent? It's not nearly as good. Some
        | transparency about how close it is to MJ would be nice,
        | though, because it can still be useful.
        
       | indigodaddy wrote:
       | Looks like I can't use this on M1/2?
        
         | liuliu wrote:
          | This is just the openjourney model, fine-tuned with
          | Dreambooth. You can use any of these tools with it on M1 /
          | M2: Draw Things, Mochi Diffusion, DiffusionBee, or the
          | AUTOMATIC1111 UI. (I wrote Draw Things.)
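          | 
          | If you'd rather script it, diffusers also runs on Apple
          | Silicon through PyTorch's "mps" backend. A rough sketch
          | (untested):
          | 
          |   from diffusers import StableDiffusionPipeline
          | 
          |   pipe = StableDiffusionPipeline.from_pretrained(
          |       "openjourney/openjourney"
          |   ).to("mps")  # Apple Silicon GPU backend
          | 
          |   prompt = "mdjrny-v4 style, a lighthouse at dusk"
          |   image = pipe(prompt).images[0]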
        
       | vjbknjjvugi wrote:
       | why does this need _write_ permissions on my hf account?
        
         | deathtrader666 wrote:
         | "For using OpenJourney you have to make an account in
         | huggingface and make a token with write permission."
        
           | admax88qqq wrote:
           | But why
        
           | [deleted]
        
       | jfdi wrote:
       | What is web4.0?!
        
       | techlatest_net wrote:
        | Some self-promotion: we've made Stable Diffusion available as
        | SaaS on AWS[1] with per-minute pricing. The unique thing about
        | our SaaS offering is that you can shut down/restart the SaaS
        | environment yourself; you get charged, per minute, only while
        | the environment is running.
       | 
       | Also, if you want to try the SaaS for free, feel free to submit a
       | request using our contact-us form [2]
       | 
       | The Web interface for SD is based on InvokeAI [3]
       | 
       | [1] https://aws.amazon.com/marketplace/pp/prodview-qj2mhlfj7cx42
       | [2] https://saas.techlatest.net/contactus [3]
       | https://github.com/invoke-ai
        
       ___________________________________________________________________
       (page generated 2023-01-26 23:02 UTC)