https://github.com/eschluntz/compress
# Compress

[Image: Compress demo]

This is a tool for automatically creating typing shortcuts from a corpus of your own writing! I use these shortcuts mainly for email and Slack:

[Images: email, slack]

This repo parses a corpus of text and suggests which shortcuts will save the most letters while typing. It then generates config files for Autokey, a Linux program that implements keyboard shortcuts! It also contains a tool for optionally parsing a Slack Data Export of your messages to create a corpus.

## What phrases should I abbreviate?

The code looks through the corpus to find common n-grams that can be replaced with much shorter phrases. The suggestions are ranked by `[characters saved] * [frequency of phrase]`. I was surprised that very short and frequent words topped this list, such as `the -> t`, instead of longer phrases that I use a lot, such as `what do you think -> wdytk`.

[Image: results]

It was amusing just to read through the results and see how repetitive some of my writing is :)

## How to pick abbreviations?

This comes down largely to preferences and heuristics for generating memorable abbreviations for different phrases. Some of my design philosophies were:

1. The abbrev cannot be a word that I want to type. Right now this is done with a blacklist, but I should change it to use my actual corpus.
2. The goal is being memorable. The 1st letter is the top choice, and 1st letter + last letter is the next choice.
3. More common phrases get priority for more memorable abbrevs.

I also like to make "families" of abbrevs to make them more memorable. This is currently done as a manual post-process step. Some example heuristics for this are:

1. Plurals should have the same abbrev as the singular, but with an "s". For example `robot -> r` and `robots -> rs`.
2. If a word has an abbrev, a phrase that contains that word should contain the abbrev. For example: `the -> t`, `robot -> r`, `the robot -> tr`.
3. Think about how similar words' abbrevs can be similar as well, e.g. `some -> s`, `someone -> sn`, `something -> st`, `sometime -> sti`.

## Instructions

1. Run `install.sh` to install dependencies. Currently tested on Python 3.10.12.
2. Put any corpus of your text that you want to compress in `data/corpus/*.txt`.
3. If you want to use your Slack history as a corpus:
   1. Export it to a folder called `data/slack_export`. Only Slack workspace admins can do this (and it only exports public channels).
   2. Change `USERNAME_TO_EXPORT` at the top of `parse_slack.py` to your Slack username.
   3. Run `parse_slack.py`. This will generate a new corpus document in `data/corpus/`.
   4. DELETE YOUR SLACK EXPORT WITH `srm`.
4. Run `find_suggested_phrases.py`. This will write the top 200 suggested shortcuts to `output/suggested_shortcuts.yaml`.
5. Edit or add any shortcuts that you want, then copy the file to `shortcuts.yaml`.
   + This is a manual step so that your customizations are not overwritten every time you run the script again.
   + It's also saved in git even though it's an output, so that I can keep it in sync across my computers :)
   + If you're starting out, I suggest going with just 10-20 shortcuts to make them easier to remember.
6. Run `generate_autokeys.py` to convert `shortcuts.yaml` into actual config files for Autokey.
7. Install Autokey.
   + Right now, Autokey is only supported on Linux with X11, not Wayland.
8. Symlink the output into Autokey's config. From the repo root: `ln -s "$(pwd)/output/autokey_phrases" "$HOME/.config/autokey/data/My Phrases/"`. The absolute source path keeps the symlink from dangling, and the quotes handle the space in `My Phrases`.
9. From now on, when you edit `shortcuts.yaml` you can re-generate and reload Autokey with `reload.sh`.

## Notes

### Autokey

Autokey uses simulated keyboard input to expand your abbreviations into the full phrases. I tried several Chrome extensions, but this worked much more reliably without conflicting with sites' own JavaScript.

The config files I generate are set to only apply when Chrome is in focus, because that's where I do most of my English typing. I found that keeping this active in the terminal and VS Code caused more problems than it solved, because my abbreviations overlapped with common short Linux commands and variable names, e.g. `t`.
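
As an illustration of the ranking described under "What phrases should I abbreviate?", here is a minimal sketch of scoring n-grams by `[characters saved] * [frequency of phrase]`. It is not the code in `find_suggested_phrases.py`: the function names, the whitespace tokenization, and the "first letter of each word" abbreviation rule are all assumptions made for this example.

```python
from collections import Counter


def ngrams(words, max_n=4):
    """Yield every run of 1..max_n consecutive words as one string."""
    for n in range(1, max_n + 1):
        for i in range(len(words) - n + 1):
            yield " ".join(words[i:i + n])


def first_letters(phrase):
    """Hypothetical abbreviation rule: first letter of each word."""
    return "".join(word[0] for word in phrase.split())


def rank_phrases(text, max_n=4, top=5):
    """Rank n-grams by (characters saved) * (frequency), best first."""
    # Assumption: lowercase and split on whitespace; a real corpus
    # would need more careful tokenization.
    counts = Counter(ngrams(text.lower().split(), max_n))
    scored = []
    for phrase, freq in counts.items():
        saved = len(phrase) - len(first_letters(phrase))
        scored.append((saved * freq, phrase))
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored[:top]


if __name__ == "__main__":
    for score, phrase in rank_phrases("the robot saw the robot", max_n=2):
        print(score, phrase, "->", first_letters(phrase))
```

Even in this toy corpus a short word such as `robot` scores close to the longer n-grams once it repeats, which is consistent with the observation that short, frequent words rank surprisingly high.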
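
The design philosophies under "How to pick abbreviations?" can be sketched in a similar way. The example below is a hedged illustration, not the repo's implementation: `candidates` and `assign_abbrevs` are hypothetical names, and applying "1st letter + last letter" to the final word of a multi-word phrase is an assumption (it happens to reproduce `what do you think -> wdytk`).

```python
def candidates(phrase):
    """Return abbreviation candidates, most memorable first.

    Assumed rules: (1) the first letter of every word, then
    (2) the same plus the last letter of the final word.
    """
    words = phrase.split()
    first = "".join(word[0] for word in words)
    first_last = first + words[-1][-1]
    return [first, first_last]


def assign_abbrevs(phrases_by_priority, blacklist):
    """Give each phrase the best candidate that is still free.

    phrases_by_priority: phrases ordered from most to least common,
        so common phrases get first pick of memorable abbrevs.
    blacklist: strings that must never be used as an abbrev,
        e.g. real words the user still wants to type.
    """
    taken = set()
    result = {}
    for phrase in phrases_by_priority:
        for abbrev in candidates(phrase):
            if abbrev not in blacklist and abbrev not in taken:
                taken.add(abbrev)
                result[phrase] = abbrev
                break
        # Phrases with no free candidate are skipped in this sketch.
    return result


if __name__ == "__main__":
    ordered = ["the", "that", "robot", "is"]
    print(assign_abbrevs(ordered, blacklist={"a", "i", "is"}))
```

Here `the` claims `t` because it comes first, `that` falls back to `tt`, and `is` gets no abbreviation because both of its candidates are blacklisted. The "families" heuristics (plurals, shared prefixes) are left out, since the README describes them as a manual step.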