[HN Gopher] Show HN: I made an open source and local translation...
___________________________________________________________________
Show HN: I made an open source and local translation app
A few years ago, right after high school, I decided to try to make
a simultaneous translation app for Android as a side project. It
took longer than expected (about 2 years) and I had to make a lot
of compromises: I had to use Google's API, and therefore make users
supply a developer key, because at the time there were no free
speech recognition and translation solutions of good quality. At
the end of university I decided to pick it up again and finally,
using OpenAI's Whisper for speech recognition and Meta's NLLB for
translation (both running locally on the phone), I managed to make
it free and totally open-source (as it was meant to be from the
beginning). The app is still in beta, so I would love your
feedback.
Author : niedev
Score : 189 points
Date : 2024-06-18 21:26 UTC (1 days ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| owenpalmer wrote:
| The UI is nice and clean, but I don't have a secondary device to
| test out the conversation mode. A video demo of how this works
| would be nice.
|
| The first translation I tried was incorrect:
|
| "Bonjour" (french) -> "- How are you ?" (english)
|
| I suppose that's the limitation of local models.
|
| Great job getting it to a functional beta though. I think the
| idea is cool, it just needs some polishing.
| niedev wrote:
| Yeah, for short inputs the models are not that precise, but for
| long and complicated ones they work quite well; NLLB was probably
| trained on very few short texts.
| tamimio wrote:
| As said, a video demo would be great, especially since it is not
| on iOS yet, but it looks great!
| niedev wrote:
| Thank you. I'll probably make one in the next few days.
| mrwyz wrote:
| The app looks very cool, but you might have a licensing issue:
| Meta's NLLB models are CC-BY-NC 4.0 (non-commercial use only). I
| would recommend checking out the Helsinki-NLP OPUS-MT models,
| which are truly open.
| Flimm wrote:
| In that case, it is not open source. Part 6 of the open source
| definition by OSI (the only commonly accepted definition)
| includes this requirement:
|
| > 6. No Discrimination Against Fields of Endeavor
| >
| > The license must not restrict anyone from making use of the
| > program in a specific field of endeavor. For example, it may
| > not restrict the program from being used in a business, or from
| > being used for genetic research.
|
| This app looks really cool and I'm glad OP made it. I'm just
| asking that the inaccurate "open source" descriptor be removed.
| Don't let my comment discourage you.
| niedev wrote:
| Meta describes NLLB as open source, but assuming it isn't (I
| know Meta is not a company that has a problem with lying),
| my code is open source, so how should I describe my app? I
| personally distinguish between open source and completely
| open source (or 100% open source), because otherwise there
| is no intermediate term. According to the OSI definition my
| app is not open source, but it is not closed source either.
| Does OSI have an intermediate definition?
| badLiveware wrote:
| Source-available[1] is likely the closest commonly used
| term for it
|
| [1] https://en.wikipedia.org/wiki/Source-available_software
| niedev wrote:
| It's the closest, but it's a very broad and not very well
| known term. Calling my app open source may not be
| technically perfect, but it is the clearest description
| (considering that in the libraries and models section I
| specify which components are open source and which are
| not; ML Kit, used to recognize the language, is closed
| source, for example). If a developer doesn't see "open
| source" written anywhere, he will only be confused about
| what the license of my app is.
| However, based on your feedback I added a note (in the
| "libraries and models" section) that NLLB has a
| non-commercial license.
| regen7253 wrote:
| Instead of "almost", maybe "Open Source Code (and free
| for non-commercial use due to model licensing)", since
| your contributions are free as in freedom, and that is
| amazing!
|
| Thank you for building this. I have been using a web
| interface connected to a local server for inference, but
| the latency was about 1 second, too much for my taste!
| niedev wrote:
| I think it is too long for the preface, but I will state
| more clearly in the libraries and models section that my
| code is open source. Thank you for the suggestion and the
| appreciation!
| niedev wrote:
| OK, I think I found the perfect solution: in the readme I
| added "(almost)" next to "open-source", with a link to the
| libraries and models section, where I explain more clearly
| which components are open-source, closed-source, and
| CC-BY-NC.
| niedev wrote:
| I know that, but since I make money only from donations, without
| giving anything in return for the money, I am considered
| non-commercial. For now I decided to use NLLB instead of OPUS
| because it has good quality and is a single model for all
| languages (so I didn't have to implement the download
| management logic for the various languages and I was able to
| make the app faster). It is a temporary solution, though: I will
| certainly replace NLLB in the future, either with OPUS-MT Big or
| with other models with a less restrictive license.
| EMIRELADERO wrote:
| Sidenote: I don't think ML model weights are even copyrightable
| khimaros wrote:
| this looks really exciting! sadly running into an issue with TTS
| initialization on CalyxOS with the eSpeak TTS engine.
| the4anoni wrote:
| I have the same issue on LineageOS 21 with RHVoice.
| niedev wrote:
| I tested the app with the Google and Samsung TTS engines. Over
| the next few days I will also test other TTS engines, including
| the 2 mentioned here, and try to solve it. I will update you in
| the issue you opened.
| 8mobile wrote:
| Congratulations on the translator, very nice and clean. Will you
| release a version for iOS or as a Progressive Web App in the
| future?
| niedev wrote:
| Thank you. For now it's not in my plans, but if the app is
| successful and I make enough money from donations in the future
| I might consider it (since I would have to buy a Mac and at
| least 2 iPhones to test the Conversation mode), and only if
| the App Store's policies are not too restrictive.
| tonetegeatinst wrote:
| Any chance of an F-Droid release?
| niedev wrote:
| I was considering adding it in the future, but I don't know if
| it meets all the necessary conditions, given that it uses ML Kit
| (closed source) for language recognition in the WalkieTalkie
| mode.
| t0bia_s wrote:
| After downloading the library I got an error: _There was an error
| with the tts initialising._ Retry won't do anything.
| niedev wrote:
| For now I have tested the app with Samsung's and Google's TTS,
| and with these 2 it should work. I am investigating the problem
| and if possible I will add support for other TTS engines.
| t0bia_s wrote:
| Samsung with LineageOS 19 without GApps, to be precise.
| niedev wrote:
| Samsung TTS? That's strange
| tmjwid wrote:
| If I were to guess, it might be RHVoice from F-Droid, as
| they are using a custom ROM.
| nanovision wrote:
| This looks neat. Surprised to find that it only got 40 upvotes on
| Product Hunt.
|
| Just bookmarked on GitHub. Any chance of launching it for iOS as
| well?
| niedev wrote:
| Thank you! I launched on Product Hunt years ago, when
| the app still used Google APIs and required the developer key;
| I'll do another launch soon. As for an iOS version, for now
| it's not in my plans, but if the app is
| successful and I make enough money from donations in the future
| I might consider it (since I would have to buy a Mac and at
| least 2 iPhones to test the Conversation mode), and only if
| the App Store's policies are not too restrictive.
| ChrisMarshallNY wrote:
| As an iOS developer who could definitely use this, I
| completely support you in keeping it Android native.
|
| I'm a believer in native software (I write native software).
| I think it results in a _much_ higher-quality user experience
| than hybrid development.
|
| There's a reason that it's not super-popular, though. It's a
| fair bit more expensive than hybrid approaches, and generally
| requires a higher skill level, as well. Most companies find
| that hybrid approaches are "good enough," and they make much
| better margins.
|
| That said, you could probably test it, using the Xcode
| Simulator. It's gotten pretty good with I/O, lately.
| Bluetooth still requires a physical device, but most of the
| stuff could probably be tested fine, if you used a modular
| approach (my preferred methodology).
| niedev wrote:
| Yes, I agree with you. Moreover, for now I am much better as
| a Java or C developer than as a web developer, so for me
| the native approach is also simpler. However, I
| practically never use the emulator, even for Android,
| precisely because I have to test Bluetooth, which, at least
| in my experience on Android, is the most difficult part.
| ChrisMarshallNY wrote:
| I wrote up this series, a few years ago[0]. It was
| actually #1 on HN for a day, but is probably a bit
| "dated," now.
|
| I did do an actual video tutorial on it for try!Swift[1].
|
| [0] https://littlegreenviper.com/series/bluetooth/
|
| [1] https://github.com/ChrisMarshallNY/ITCB-master
| niedev wrote:
| Really cool tutorial! When/if I start developing for iOS
| this will be really useful, thanks!
| NKosmatos wrote:
| Nice one! Thanks for sharing this.
|
| If possible, it would be good to include a list of devices on
| which you know performance is good. We can all understand that most
| flagship mobiles will run it smoothly, but what about the average
| user?
|
| "...to be able to use the app without the risk of crashing you
| need a phone with at least 6GB of RAM, and to have a good enough
| execution time you need a phone with a fast enough CPU."
| niedev wrote:
| Thank you! The app can be used on any device that has at least
| 6GB of RAM. On a phone with enough RAM but a
| slower CPU, the only effect is that the app takes longer to
| perform the translation and voice recognition (at least in my
| tests it doesn't lag, so it's always usable). But yes, maybe
| in the future, when I have the opportunity to test it on
| enough devices, I will make a list of the performance on
| various SoCs.
| spacemanspiff01 wrote:
| What does the latency look like with this?
| niedev wrote:
| With a Snapdragon 8 Plus Gen 1, I measured about 2 seconds for
| speech recognition of 30s of audio (although for shorter
| durations it takes not much less than 2s), while for
| translation about 2 seconds for a text of 300 characters (in
| this case, however, the performance scales linearly based on
| the length of the text).
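
The figures quoted above can be turned into a rough back-of-the-envelope estimate. The sketch below is only an illustration built from those numbers (Snapdragon 8 Plus Gen 1, about 2 s for recognition of up to 30 s of audio, about 2 s per 300 characters of translation, scaling linearly). The constant and function names are hypothetical and are not taken from RTranslator's code, and treating recognition time as a flat 2 s is an assumption that slightly overestimates short clips.

```python
# Rough latency model based on the numbers quoted in the comment above.
# All names here are hypothetical; this is not RTranslator code.

# Assumption: recognition takes about 2 s for any clip of up to 30 s.
SPEECH_RECOGNITION_SECONDS = 2.0

# Quoted figure: about 2 s to translate 300 characters, scaling linearly.
TRANSLATION_SECONDS_PER_CHAR = 2.0 / 300


def estimate_latency_seconds(text_length_chars: int) -> float:
    """Estimate recognition plus translation time for one utterance (<= 30 s)."""
    if text_length_chars < 0:
        raise ValueError("text_length_chars must be non-negative")
    translation_seconds = text_length_chars * TRANSLATION_SECONDS_PER_CHAR
    return SPEECH_RECOGNITION_SECONDS + translation_seconds


if __name__ == "__main__":
    for chars in (50, 150, 300):
        print(f"{chars} chars: about {estimate_latency_seconds(chars):.1f} s")
```

Under these assumptions, a 300-character utterance comes out at roughly 4 s end to end on that chip; slower SoCs would need their own measured constants.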
| pcthrowaway wrote:
| Suggestion
|
| > RTranslator is an open-source (almost), free, and offline real-
| time translation app for Android.
|
| change to:
|
| > RTranslator is an (almost) open-source, free, and offline real-
| time translation app for Android.
|
| Reading it the first way, it's possible to misunderstand and
| think it's almost (but not completely) _free_ , rather than
| almost (but not completely) open-source
| niedev wrote:
| Thank you, you are right, I will do that.
| teleforce wrote:
| Great, we are more than a century ahead of schedule for the Star
| Trek universal translator [1].
|
| Joking aside, this will be very beneficial software if it can
| work seamlessly in real-time for countries like Japan who can't
| speak English, and France who won't speak English.
|
| [1] Star Trek's Translator Technology, Explained:
|
| https://gamerant.com/star-trek-translator-technology-explain...
| niedev wrote:
| That would be the end goal (very far off :D). But yeah, joking
| aside, I aim to make the translation better and better as
| technology advances. The current level is a usable start, but
| maybe one day it will be truly seamless.
| billylo wrote:
| Hi, I made a non-local version that supports translation to
| other languages.
|
| Android:
| https://play.google.com/store/apps/details?id=org.evergreenl...
| iOS: https://apps.apple.com/app/3po/id6503194251
|
| p.s. I contemplated calling it Sato, ended up using 3PO as the
| name. :-)
| ssrlcc wrote:
| This app is very cool! In walkie talkie mode, you may want to
| tweak the UI to make it clearer when the microphone is listening
| and when it's not. I saw the microphone icon changes color, but a
| stronger visual cue may help. Google Translate may be a good
| point of reference: in its conversation mode the shape of the
| microphone icon changes when the mic is active. I've also noticed
| that my message is sometimes cut short when translated; it only
| translates the first half.
| niedev wrote:
| Thank you for the feedback! In a little while I plan to redo
| the entire graphic interface, and yes a method to make it
| clearer when the microphone is listening is already in the
| plans. As for the cutting of sentences you can probably solve
| it by increasing the microphone's sensitivity from the app
| settings (from there you can also change other settings
| regarding the activation of the microphone, but most likely it
| is a sensitivity problem).
| sillystuff wrote:
| FYI, after model download, I get an error: There was an error
| with the tts initialization
|
| Using LineageOS+microG and eSpeak for TTS (equivalent Android
| version = 13). This setup works fine for TTS with the commercial
| map application, Here WeGo Maps.
| niedev wrote:
| Unfortunately you are not the only one having this problem
| (https://github.com/niedev/RTranslator/issues/8). I will try
| to fix this bug over the next few days; for now it has top
| priority.
| sillystuff wrote:
| Thanks for pointing me to that bug report. And, thanks for
| working on it :).
|
| Your initial fix of not disabling the entire app's
| functionality if it can't get TTS working sounds like a good
| one.
| gslepak wrote:
| If you are doing translation locally on the device, why does your
| Privacy Policy say you are sending voice and transcriptions as
| well as personal location data to Google?
|
| _> RTranslator in any case collects and processes data that will
| then be sent to Google, such as: audio, the transcription of
| which is transmitted at a later time via bluetooth to the phone
| with which you are communicating, and the transcription of the
| audio received by the other user, to carry out the translation._
|
| --
| https://github.com/niedev/RTranslator/blob/v2.00/privacy/Pri...
| niedev wrote:
| The mention of location being sent to Google is an error that I
| hadn't noticed (I will correct it in 5 minutes). The rest, as I
| say in the readme, is because the privacy policy used for
| RTranslator 2.0 is for now the same as the one for RTranslator
| 1.0 (which used Google's API). I will have a new privacy policy
| for RTranslator 2.0 drawn up with a lawyer soon.
___________________________________________________________________
(page generated 2024-06-19 23:01 UTC)