[HN Gopher] Greppability is an underrated code metric
___________________________________________________________________
Greppability is an underrated code metric
Author : thunderbong
Score : 1161 points
Date : 2024-09-03 02:47 UTC (1 days ago)
(HTM) web link (morizbuesing.com)
(TXT) w3m dump (morizbuesing.com)
| JoshTriplett wrote:
| This is the reason many coding styles and tools (including the
| Linux kernel coding style and the default Rust style as
| implemented in rustfmt) do not break string constants across
| lines even if they're longer than the desired line length: you
| might see the string in the program's output, and want to search
| for the same string in the code to find where it gets shown.
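The point above can be sketched in a few lines of JavaScript. This is only an illustration under made-up assumptions: the two sample source strings and the `grepLines` helper are hypothetical, standing in for a real file and a fixed-string, line-oriented search such as `grep -F`.

```javascript
// Hypothetical source text: the same error message, once kept on one line
// and once wrapped to satisfy a line-length limit.
const oneLine = [
  'function connect() {',
  '  throw new Error("could not connect to the database: timeout exceeded");',
  '}',
].join("\n");

const wrapped = [
  'function connect() {',
  '  throw new Error("could not connect to the " +',
  '    "database: timeout exceeded");',
  '}',
].join("\n");

// Minimal line-oriented fixed-string search (hypothetical helper).
function grepLines(source, needle) {
  return source
    .split("\n")
    .map((text, index) => ({ line: index + 1, text }))
    .filter(({ text }) => text.includes(needle));
}

// Text copied from the program's output.
const seenInOutput = "could not connect to the database";

console.log(grepLines(oneLine, seenInOutput)); // one hit, on line 2
console.log(grepLines(wrapped, seenInOutput)); // no hits: the literal is split
```

The wrapped variant still produces the same message at runtime, but no single source line contains it, so a line-based search comes back empty.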
| knodi123 wrote:
| My team drives me bonkers with this. They hear the general
| principle "really long lines of code are bad", but extrapolate
| it to "no characters shall pass the soft gutter no matter
| what".
|
| Even if you have, say, 5 sequential related structs, that are
| all virtually identical, all written on one line so that the
| similarities and differences are obvious at a mere glance...
| Then someone comes through and touches my file, and while
| they're at it, "fix" the line that went 2 characters past the
| 80 mark by reformatting the 4th struct to span several lines.
| Now when you see that list of structs, you wonder "why is this
| one different?" and you have to read carefully to determine,
| nope, it just contained one longer string. Or god forbid they
| reformat all the structs to match, turning a 1-page file into 3
| pages, and making it so you have to read and understand each
| element of each struct just to see what's going on.
|
| If I could have written the rule of thumb, I would have said
| "No logic or control shall happen after the end of the gutter."
| But if there's a paragraph-long string on one line- who cares??
| We all have a single keystroke that can toggle soft-wrap, and
| the odds that you're going to need to know anything about that
| string other than "it's a long string" are virtually nil.
|
| Sorry. I got triggered. :-)
| yas_hmaheshwari wrote:
| My team also had a similar thing in place. I am saving this
| article in my pocket saves, so that I can give "proofs" of
| why this is better
|
| From Zen of Python:
|
|     Special cases aren't special enough to break the rules.
|     Although practicality beats purity.
|
| https://peps.python.org/pep-0020/
| BigJono wrote:
| Yep this triggers the fuck out of me too. It drives me
| absolutely insane when I'm taking the time and effort to
| write good test cases that use inline per test data that I've
| taken the time to format so it's nice and readable for the
| next person, then the next person comes along, spends 30
| seconds writing some 2 line rubbish to hit a code coverage
| metric, then spends another 60 seconds adding a linter rule
| that blows all the test data out to 400 lines of unreadable
| dogshit that uses only the left 15% of screen real estate.
| port19 wrote:
| I routinely spot 3-line prints with the string on its own
| line in our code. Even for cases where the string + print
| don't even reach the 80 character "limit"
| arp242 wrote:
| This is why autoformatters that frob with line endings are
| just terrible and fundamentally broken.
|
| I'm fairly firmly in the "wrap at 80" camp by the way; but
| sometimes a tad longer just makes sense. Or shorter for that
| matter: forced removal of line breaks is just as bad.
| jimmaswell wrote:
| 80 feels really impractically narrow. A project I work on
| uses 110 because it's approximately the widest you can
| comfortably compare two revisions on the same monitor, or
| was for some person at some time, and I can live with it,
| but any less would just feel so cramped. A few indentation
| levels deep and I'd be writing newspaper columns.
| NotMichaelBay wrote:
| There is usually a way to restructure the code so that it
| doesn't have multiple levels of nested indentation, which
| is a good practice IMO because it makes the code easier
| to read.
| knodi123 wrote:
| 80 columns wide is the width we had back in the late 90s.
| Displays are nothing like that anymore. I managed to talk
| my team into setting the linter to something more
| reasonable, but individuals still feel like they're being
| virtuous if they stick to 80 and reformat any line they
| touch that goes over. It's just dogma!
| kiitos wrote:
| 80 _is_ impractically narrow. Even 120 is overly strict.
| SLoC line length isn't something that can or should be
| enforced by linters, or re-formatted by formatters.
| edflsafoiewq wrote:
| This is the world autoformatters have wrought. The central dogma
| of the autoformatter is that "formatting" is based on dumb
| syntactic rules with no inflow of imprecise human judgements.
| scrollaway wrote:
| Most autoformatters do not reformat string constants as GP
| has said, and even if they did, this is something that can
| be much more accurately and correctly specified with an AF
| than with a human.
|
| Autoformatting collectively saves probably close to
| millions of work hours per year in our industry, and that's
| at the current adoption. Do you think it's productive to
| manually space things out, clean up missing trailing commas
| and what not? Machines do it better.
| edflsafoiewq wrote:
| > Even if you have, say, 5 sequential related structs,
| that are all virtually identical, all written on one line
| so that the similarities and differences are obvious at a
| mere glance... Then someone comes through and touches my
| file, and while they're at it, "fix" the line that went 2
| characters past the 80 mark by reformatting the 4th
| struct to span several lines.
|
| Autoformatters absolutely do this. They do not understand
| considerations like symmetry.
|
| I am doubtful as to the costs of "somewhere in the
| codebase there is a missing trailing comma".
| JoshTriplett wrote:
| The wins of autoformatting are 1) never having to have a
| dispute over formatting or have formatting depend on who
| last touched code, 2) never manually formatting code or
| depending on someone's editor configuration, 3) having CI
| verify formatting, and 4) not having someone
| (intentionally or unintentionally) make unrelated
| formatting changes in a commit.
|
| Also, autoformatters can be remarkably good. For
| instance, rustfmt will do things like:
|     x.func(Some(MyStruct {
|         field: big + long + expr,
|         field2,
|     }));
|
| rather than mindlessly introducing three levels of
| indentation.
| EasyMark wrote:
| I have been places where we allow long strings, but other
| things aren't allowed and generally 80 to 100 char limits
| otherwise. I like 100 for c++/java and 80 for C. If it gets
| much longer than that (not being strings) then it's time for
| a rethink in most cases, grouping/scoping symbols are getting
| too deep. I'm sure other languages may or may not have that
| as a reasonable argument. It is just a rule of thumb though.
| bobbylarrybobby wrote:
| If I recall, rustfmt had a bug where long string literals (say,
| over 120 chars or so -- or maybe if it was that the string was
| long enough to extend beyond the gutter when properly
| indented?) would prevent formatting of the entire file they
| were in. Has this been fixed?
| JoshTriplett wrote:
| Not the whole file, but sufficiently long un-line-breakable
| code in a complex statement can cause rustfmt to give up on
| trying to format _that statement_. That's a known issue that
| needs fixing.
| abc-1 wrote:
| A lot of this reads like code search tools could and should be a
| lot better. They probably will be with AI finding its way into
| everything. In the old days, people would Hungarian prefix types,
| but now the IDE mitigates that with color codes.
| klodolph wrote:
| Do you have some ideas for how to make code search better?
|
| Right now, code search is basically just text search. If you
| think code search tools "could and should" be a lot better,
| what kind of improvements are you thinking about? How would
| those improvements work?
| Terr_ wrote:
| Not OP, but we wouldn't need to worry so much about picking
| out distinct greppable names _if_ (big if) there were tools
| that parsed the code to draw out concepts for us, ex:
|
| 1. The popular "Find Usages" which varies widely in accuracy
| and reliability by language, IDE, and codebase meta-quirks.
|
| 2. Tools that show Callee/Caller trees, and sometimes
| possible data-flows between variables.
|
| 3. DSLs to search hierarchies, like how XPath lets you find
| XML elements based on nesting, rather than relying on a
| distinctly greppable _single_ tag-name for the leaf you're
| interested in. (e.g. `<Product><Name>` vs `<ProductName>`)
|
| When things go well, the actual variable name no longer needs
| to restate certain aspects and relationships that can instead
| be found through metadata.
|
| For example, `GiftCard.purchaser_customer_uuid` is nicely
| greppable, but you could relax that to `GiftCard.purchaser`
| if it had a static type of `UUID<Customer>`. Or perhaps you
| could go to the `Customer.uuid` definition and say "Show me
| all variables that can populate or be-populated-by this one,
| up to X steps out, and excluding ones that are function
| scoped."
|
| That said, I _do_ advocate for "greppability" as a general
| practice, since I seldom trust that languages, tools, or
| institutions will come together in a way that makes it
| unnecessary.
| klodolph wrote:
| I guess I wasn't thinking of "find usages", but as the
| article points out, it's hard to find usages if the usages
| are dynamic.
|
| The solution--to write code which is _less_ dynamic--helps
| code search and features like find usages.
| alexpovel wrote:
| Regarding your third point, I put together a tool capable
| of that to some degree.
|
| It allows you to grep inside source code, but limit the
| search to e.g. "only docstrings inside class definitions",
| among other things. That is, it allows nesting and is
| syntax aware. That example is for Python, but the tool
| speaks more languages (thanks to treesitter).
|
| https://github.com/alexpovel/srgn/blob/main/README.md#multi
| p...
| abc-1 wrote:
| Vector embeddings.
| dragonwriter wrote:
| > Right now, code search is basically just text search.
|
| We have lots of code search that is much more syntax-aware
| than just text search, but it tends to be behind very limited
| UI, because we have all the tech to do much better code
| search, but no one has come up with a generally-usable UI for
| it, so we just have very specific instances -- like "go to
| definition", "find references", etc.
|
| That takes all the same technological bits that would be needed
| for, say, "find all definitions of functions visible in the
| current scope whose name starts with 'ban'" or "find all
| definitions of int8 constants visible in the current
| scope"...but what's the UI that makes that kind of searching
| outside of the kind of special cases now behind their own IDE
| menu items usable?
| ddfs123 wrote:
| Unless you have syntax-aware grep support, I don't see how
| searching nested key json could be better. But grep is the
| default installed. Not to mention ad-hoc languages that do
| not have any IDE support.
| abc-1 wrote:
| If you put a lot of arbitrary constraints to not allow it to
| be better, sure. Enjoy.
| medstrom wrote:
| There is no conflict between improving tools and learning
| how to express your code in such a way that as many tools
| as possible work better OOTB.
| uasi wrote:
| gron makes nested JSON greppable
| https://github.com/tomnomnom/gron
| hoherd wrote:
| `gron` is so underrated. Usually when I try to show people
| how useful it is they don't seem to understand how powerful
| it is. One common use is showing how to customize only one
| part of a helm chart by checking values of an already
| installed chart:
|
|     $ helm get values -n $NS $DEPLOYMENT -o json | gron | grep resources | gron -u | json-to-yaml.py
|     elasticsearch:
|       client:
|         resources:
|           limits:
|             cpu: 3
|             memory: 4Gi
|           requests:
|             cpu: 1
|             memory: 2Gi
|       data:
|         resources:
|           limits:
|             cpu: 6
|             memory: 6Gi
|           requests:
|             cpu: 200m
|             memory: 2Gi
|     fluentd:
|       resources:
|         limits:
|           memory: 768Mi
|         requests:
|           memory: 384Mi
|
| That snip could be provided to another team or a customer
| as a yaml file that could be included with `helm upgrade -f
| whatever.yaml`. This is soooo much easier than digging that
| limited set of data out of the much more detailed data.
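Why flattening helps can be shown with a small sketch. The `flatten` function below is hypothetical and only loosely modeled on the idea; it is not gron's implementation and does not reproduce its exact output format (for instance, it emits no lines for empty containers and does not quote unusual keys). The sample `values` object is made up.

```javascript
// Flatten a parsed JSON value into one "path = value;" line per leaf, so a
// plain line-oriented search shows the full path of every match.
function flatten(value, path = "json", out = []) {
  if (Array.isArray(value)) {
    value.forEach((item, i) => flatten(item, `${path}[${i}]`, out));
  } else if (value !== null && typeof value === "object") {
    for (const [key, child] of Object.entries(value)) {
      flatten(child, `${path}.${key}`, out);
    }
  } else {
    out.push(`${path} = ${JSON.stringify(value)};`);
  }
  return out;
}

// Made-up sample data.
const values = {
  fluentd: { resources: { limits: { memory: "768Mi" } }, replicas: 2 },
};

const lines = flatten(values);
// Each surviving line carries its whole ancestry, unlike a grep on nested JSON.
console.log(lines.filter((line) => line.includes("resources")));
```

Because every line is self-describing, filtering by a key name keeps the surrounding context that a search over indented JSON would lose.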
| NavinF wrote:
| > ad-hoc languages
|
| This is self-inflicted.
| ralusek wrote:
| Especially in untyped languages, working with an old or
| unfamiliar codebase, sometimes the only way to know "was anything
| else using this code" is just to search for the name of a
| function or whatever.
| adpirz wrote:
| I've seen some pretty wild conditional string interpolation where
| there were like 3-4 separate phrases that each had a number of
| different options, something akin to `${a ? 'You' : 'we'} ${b ?
| 'did' : 'will do'} ${c ? 'thing' : 'things'}`.
|
| When I was first onboarding to this project, I was tasked with
| updating a component and simply tried to find three of the words
| I saw in the UI, and this was before we implemented a
| straightforward path-based routing system. It took me far too
| long just to find what I was going to be working on, and that's
| the day I distinctly remember learning this lesson. I was pretty
| junior, but I'd later return to this code and threw it all away
| for a number of easily greppable strings.
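A rough sketch of the trade-off described above, with made-up phrases and function names (`piecewise` and `wholePhrase` are hypothetical, not the commenter's code). Both functions return the same text, but only the second keeps each complete sentence in the source.

```javascript
// Piecewise interpolation: none of the final sentences exists in the source,
// so searching for text seen in the UI finds nothing.
function piecewise(past, plural) {
  return `We ${past ? "did" : "will do"} the ${plural ? "things" : "thing"}`;
}

// Whole phrases: every sentence the UI can show appears verbatim in the source.
function wholePhrase(past, plural) {
  if (past) {
    return plural ? "We did the things" : "We did the thing";
  }
  return plural ? "We will do the things" : "We will do the thing";
}

console.log(piecewise(true, false)); // We did the thing
console.log(wholePhrase(true, false)); // We did the thing
```

The second version is longer, but a search for "We will do the things" lands directly on the line that produces it.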
| ctxc wrote:
| Tangential: I love it when UIs say "1 object" and "2 objects".
| Shows attention to detail.
|
| As opposed to "1 objects" or "1 object(s)". A UI filled with
| "(s)", ughh
| petepete wrote:
| Moreso when it's not tripped up by "1 sheeps" or "1
| diagnoses".
| gnuvince wrote:
| I like the more robotic "Objects: 1" or "Objects: 2", since
| it avoids the pluralization problems entirely (e.g., in
| French 0 is singular, but in English it's plural; some words
| have special forms when pluralized, such as child -> children or
| attorney general -> attorneys general). And related to this
| article, it's more greppable/awkable, e.g. `awk '/^Objects:/
| && $2 > 10'`.
| ajuc wrote:
| Fun fact - I had to localize this kind of logic to my
| language (Polish). I realized quickly it's fucked up.
|
| This is roughly the logic:
|
|     function strFromNumOfObjects(n) {
|         if (n === 1) {
|             return "obiekt";
|         }
|         let last_digit = (n%10);
|         let penultimate_digit = Math.trunc((n%100)/10);
|         if ((penultimate_digit == 0 || penultimate_digit >= 2) && last_digit > 1 && last_digit <= 4) {
|             return "obiekty";
|         }
|         return "obiektow";
|     }
|
| Basically pluralizing words in Polish is a fizz-buzz problem
| :) In other Slavic languages it should be similar BTW
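For comparison, the same selection can probably be delegated to the standard `Intl.PluralRules` API, which exposes the CLDR plural categories. This is a sketch under assumptions: it presumes the runtime ships Polish locale data, the `FORMS` table and `pluralizeObject` name are made up here, and mapping the "other" category (used for fractions) to the same form is a simplification.

```javascript
// Word forms taken from the comment above; the mapping from CLDR categories
// to those forms is this sketch's assumption.
const FORMS = {
  one: "obiekt",
  few: "obiekty",
  many: "obiektow",
  other: "obiektow", // fractions; simplified for the example
};

const polishRules = new Intl.PluralRules("pl");

// Hypothetical helper: pick the noun form for a whole number of objects.
function pluralizeObject(n) {
  return FORMS[polishRules.select(n)];
}

for (const n of [1, 2, 5, 12, 22, 112]) {
  console.log(n, pluralizeObject(n));
}
```

The digit arithmetic moves into locale data, so other languages would only need a different `FORMS` table and locale tag.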
| nox101 wrote:
| Sounds like you're going to have a bad time
|
| https://www.foo.be/docs/tpj/issues/vol4_1/tpj0401-0013.html
| RodgerTheGreat wrote:
| one of the strangest and most grep-hostile approaches to
| identifiers that I have ever observed is Nim ignoring both case
| and underscores in an effort to allow everyone to write code in
| their preferred style:
|
| https://nim-lang.org/docs/manual.html#partial-caseminusinsen...
| uasi wrote:
| Nim even provides a dedicated grep-like tool to search for
| identifiers regardless of the style
| https://nim-lang.org/docs/nimgrep.html
| planetis wrote:
| And it works pretty well, coming from 6+ years of experience.
| It's not that strange if you consider case insensitive filesystems
| and email addresses. But on the internet you only hear the
| opinion of the loudest minority.
| dblotsky wrote:
| Hard agree with the idea of greppability, but hard disagree about
| keeping names the same across boundaries.
|
| I think the benefit of having one symbol exist in only one domain
| (e.g. "user_request" only showing up in the database-handling
| code, where it's used 3 times, and not in the UI code, where it
| might've been used 30 times) reduces more cognitive load than is
| added by searching for 2 symbols instead of 1 common one.
| Noumenon72 wrote:
| Not to mention the readability hit from identifiers like
| foo.user_request in JavaScript, which triggers both linters and
| my own sense of language convention.
| emn13 wrote:
| Both of those are easy to fix. You'll adapt quickly if you
| pick a different convention.
|
| Additionally, I find that in practice such "unusual" code is
| actually beneficial - it often makes it easy to see at a
| glance that the code is somehow in sync with some external
| spec. Especially when it comes to implicit usages such as in
| (de)serialization, noticing that quickly is quite valuable.
|
| I'd much rather trash _every_ language's coding conventions
| than use subtly different names for objects serialized and
| shared across languages. It's just a pain.
| plorkyeran wrote:
| I've also found that I sometimes really like when I grep for a
| symbol and hit some mapping code. Just knowing that some value
| goes through a specific mapping layer and then is never
| mentioned again until the spot where it's read often answers
| the question I had by itself, while without the mapping code
| there'd just be no occurrences of the symbol in the current
| code base and I'd have no clue which external source it's
| coming from.
| runevault wrote:
| Probably depends on how your system is structured. if you know
| you only want to look in the DB code, hopefully it is either
| all together or there is something about the folder naming
| pattern you can take advantage of when saying where to search
| to limit it.
|
| The upside to doing it this way is it makes your grepping more
| flexible by allowing you to either only search the one part of
| the codebase to see say DB code or see all the DB and UI things
| using the concept.
| gregjor wrote:
| I have mixed thoughts on this too. Fortunately grep (rg in my
| case) easily handles it:
|
| rg -i 'foo.?bar' finds all of foo_bar, fooBar, and FooBar.
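The same pattern can be tried out as a JavaScript regular expression. The identifier list is made up for illustration; the sketch shows both what the pattern catches and where it stops.

```javascript
// Same pattern as the rg invocation above: "foo", at most one arbitrary
// character, then "bar", compared case-insensitively.
const pattern = /foo.?bar/i;

// Made-up identifiers in several naming styles.
const identifiers = ["foo_bar", "fooBar", "FooBar", "foo-bar", "foobar", "foo__bar"];

const matches = identifiers.filter((name) => pattern.test(name));
console.log(matches); // every entry except "foo__bar"
```

Note that `.?` accepts any single character, so the pattern also matches unrelated names such as "foodbar"; it trades some precision for covering every naming convention in one search.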
| ajayvk wrote:
| Just spent an hour trying to figure out how a Hugo theme was
| picking up a shortcode definition. Grep did not help.
|
| Turned out the shortcode name is based on the file name rather
| than file contents.
| db48x wrote:
| Rust and Javascript and Lisp all get extra points because they
| put a keyword in front of every function definition. Searching
| for "fn doTheThing" or "defun do-the-thing" ensures that you find
| the actual definition. Meanwhile C lacks any such keyword, so the
| best you can do is search for the name. That gets you a sea of
| callers with the declarations and definitions mixed in. Some C
| coding conventions have you split the definition into two lines,
| first the return type on a line followed by a second line that
| starts with the function name. It looks ugly, but at least you
| can search for "^doTheThing" to find just the definition(s).
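A small sketch of why the two-line convention helps, using a made-up C snippet held in a JavaScript string. The anchored pattern plays the role of the `^doTheThing` search; the line numbers refer only to this sample.

```javascript
// Made-up C source following the "return type on its own line" convention.
const cSource = [
  "int doTheThing(int x);", // line 1: declaration
  "",
  "int", // line 3: return type of the definition
  "doTheThing(int x)", // line 4: definition starts at column 0
  "{",
  "    return x + 1;",
  "}",
  "",
  "int main(void) { return doTheThing(41); }", // line 9: caller
].join("\n");

// Return the 1-based numbers of the lines that match the pattern.
function matchingLines(source, pattern) {
  return source
    .split("\n")
    .map((text, i) => ({ line: i + 1, text }))
    .filter(({ text }) => pattern.test(text))
    .map(({ line }) => line);
}

console.log(matchingLines(cSource, /doTheThing/)); // [ 1, 4, 9 ]
console.log(matchingLines(cSource, /^doTheThing\b/)); // [ 4 ]
```

The unanchored search mixes the declaration, the definition, and the caller; anchoring at the start of the line isolates the definition.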
| CGamesPlay wrote:
| Not JavaScript. Cool kids never write "function" any more, it's
| all arrow functions. You can search for const, which will
| typically work, but not always (could be a let, var, or
| multi-const initializer).
| lispisok wrote:
| Am I the only one who hates arrow functions?
| spartanatreyu wrote:
| I did, until I used them enough where I saw where they were
| useful.
|
| The bad examples of arrow functions I saw initially were
| of:
|
| 1. Devs trying to mix them in with OOP code as a bandaid
| over OOP headaches (e.g. bind/this) instead of just not
| using OOP in the first place.
|
| 2. Devs trying to stick functional programming everywhere
| because they had seen a trivial example where a `.map()`
| made more semantic sense than a for/for-in/for-of loop.
| Despite the fact that for/for-in/for-of loops were easier
| to read for anything non-trivial and also had better
| performance because you had access to the `break`,
| `continue` and `return` keywords.
| mewpmewp2 wrote:
| Another benefit of using for instead of array fns is that
| it is easy to add await keyword should the fn become
| async.
|
| But many teams will have it as a rule to always use array
| fns.
| jappgar wrote:
| That gives you the option of making it serially
| async but not parallel, which can be achieved easily
| using Promise.all in either scenario.
| adregan wrote:
| As an aside: It's way less ergonomic, but you likely want
| `Promise.allSettled` rather than `Promise.all` as the
| first promise that throws aborts the rest.
| wruza wrote:
| It doesn't really abort the rest, it just prioritizes the
| selection of a first catch-path as a current
| continuation. The rest is still thenable, and there's no
| "abort promise" operation in general. There are abort
| signals, but it's up to an async process to accept a
| signal and decide when/whether to check it.
| adregan wrote:
| Admittedly, I was being a bit hand-wavy and describing a
| bit more of how it feels rather than the way it is (I'm
| perpetually annoyed that promises can't be cancelled),
| but I was thinking of the code I've seen many times
| across many code bases:
|
|     let results;
|     try {
|       results = await Promise.all(vals.map(someAsyncOp))
|     } catch (err) {
|       console.error(err)
|     }
|
| While you _could_ pull that promises mapping into a
| variable and keep it thenable, 99% of the time I see the
| above instead. Promises have some rough edges because
| they are stateful, so I think it might be easier to
| recommend swapping that Promise.all for a
| Promise.allSettled, and using a shared utility for
| parsing the promise result.
|
| I consider this issue akin to the relationship between
| `sort`, `reverse`, `splice`, the mutating operation APIs,
| and their non mutating counterparts `toSorted`,
| `toReversed`, `toSpliced`. Promise.all is kind of the
| mutating version of allSettled.
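The behavior discussed in this sub-thread can be observed with a small experiment. This is a sketch with made-up operations (`op`, its labels, and the delays are hypothetical): it shows that `Promise.all` rejects as soon as one input rejects, that the remaining operations keep running anyway, and that `Promise.allSettled` waits for and reports every outcome.

```javascript
async function demo() {
  const completed = [];

  // Hypothetical async operation: waits `ms`, then fails or records its label.
  const op = async (label, ms, fail) => {
    await new Promise((resolve) => setTimeout(resolve, ms));
    if (fail) throw new Error(`${label} failed`);
    completed.push(label);
    return label;
  };

  let firstError = null;
  try {
    await Promise.all([op("a", 5, true), op("b", 20, false)]);
  } catch (err) {
    firstError = err.message; // only the first rejection surfaces here
  }
  // "b" is still pending at this point: it was neither awaited nor cancelled.
  const completedAtRejection = [...completed];

  const settled = await Promise.allSettled([op("c", 5, true), op("d", 20, false)]);

  return {
    firstError,
    completedAtRejection,
    completed, // includes "b", which finished in the background
    statuses: settled.map((result) => result.status),
  };
}

demo().then((result) => console.log(result));
```

The background completion of "b" is the practical consequence of promises not being cancellable: its result is simply dropped unless the code keeps a reference to it.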
| throwaway2037 wrote:
| > also had better performance because you had access to
| the `break`, `continue` and `return` keywords.
|
| This is a great point.
|
| One more: Debugging `.map()` is also much harder than a
| for loop.
| medstrom wrote:
| I feel there are a few ways to invoke .map() in a
| readable way and many ways that make the code flow
| needlessly indirect.
|
| Should be a judgment call, and the author needs to be
| used to doing both looping and mapping constructs, so
| that they are unafraid of the bit of extra typing needed
| for the loop.
| ndnxncjdj wrote:
| I very much prefer the way scoping is handled in arrow
| functions.
| nosianu wrote:
| A simple heuristic I use is to use arrow functions for
| inline function arguments, and named "function" functions
| for all others.
|
| One reason is exactly what the subject of discussion is
| here, it's easier to string-search with that keyword in
| front of the name, but I don't need that for trivial inline
| functions (whenever I do I make it an actual function that
| I declare normally and not inline).
|
| Then there's the different handling of "this", depending on
| how you write your code this may be an important reason to
| use an arrow function in some places.
| turboponyy wrote:
| I like them because it reinforces the idea that functions
| are just values like any other - having a separate keyword
| feels like it is inconsistent.
| 0xfffafaCrash wrote:
| Moreover the binding and lexical scope aspects supported
| by classic functions are amongst the worst aspects of the
| language.
|
| Arrow functions are also far more concise and ergonomic
| when working with higher order functions or simple
| expressions
|
| The main thing to be wary of with arrow functions is when
| they are used anonymously inline without it being clear
| what the function is doing at a glance. That and Error
| stack traces but the latter is exacerbated by there being
| no actual standard regarding Error.prototype.stack
| mewpmewp2 wrote:
| Why do you want to reinforce that idea?
|
| To me arrow functions mostly just decrease readability
| and make functions blend in too much, when it should be an
| important distinction what is a function and what is not.
| turboponyy wrote:
| Not to be dismissive, but because I like it - it just
| sits right with me.
| benrutter wrote:
| I'm not a javascript programmer, but I really like the
| arrow pattern from a distance exactly because it enforces
| that idea.
|
| My experience is that newcomers are often thrown off and
| confused by higher order functions. I think partly
| because, well let's be honest they just _are_ more
| confusing than normal functions, but I think it's also
| because languages often bind functions differently from
| everything else.
|
| `const cool = () => 5`
|
| Makes it obvious and transparent that `cool` is just a
| variable, whereas:
|
| `function cool() {return 5}`
|
| looks very different from other variable bindings.
| throwitaway1123 wrote:
| Since we're on the topic of higher order functions, arrow
| functions allow you to express function currying very
| succinctly (which some people prefer). This is a
| contrived example to illustrate the syntactical
| differences:
|
|     const arrow = (a) => (b) => `${a}-${b}`
|
|     function verbose(a) {
|       return function (b) {
|         return `${a}-${b}`
|       }
|     }
|
|     function uncurried(a, b) {
|       return `${a}-${b}`
|     }
|
|     const values = ['foo', 'bar', 'baz']
|     values.map(arrow('qux'))
|     values.map(verbose('qux'))
|     values.map(uncurried.bind(null, 'qux'))
|     values.map((b) => uncurried('qux', b))
| nsonha wrote:
| > should be important distinction what is a function and
| what is not
|
| code is to express logic clearly to the reader. We should
| assess it for that purpose, before assessing it for any
| derivative, secondary concern such as whether categories
| of things in code (function etc) visually pop out when
| you use some specific tool like vim, or grep. There are
| syntax highlighters for a reason. And maybe if grep sucks
| with code then build the proper tool for code searching,
| instead of writing code after the tool.
| crabmusket wrote:
| I don't like using them _everywhere_, but they're very
| handy for inline anonymous functions.
|
| But it really pains me when I see
|
| export const foo = () => {}
|
| instead of
|
| export function foo() {}
| creesch wrote:
| Thank you, that's something I also never have understood
| myself. For inline anonymous functions like callbacks
| they make perfect sense. As long as you don't need
| `this`.
|
| But everywhere else they reduce readability of the code
| with no tangible benefit I am aware of.
| berkes wrote:
| I wish javascript had a built-in or at least (de facto)
| default linter. Like gofmt or rustfmt. Or clippy even.
|
| One that could enforce these styles. Because not only is
| the
|
| export const foo = () => {}
|
| painful in itself, it will quite certainly get intermixed
| with the
|
| function foo() {}
|
| and then in the next library a
|
| const foo = function() {}
|
| and so on. I'd rather have a consistently irritating
| style, than this willy-nilly yolo style that the JS
| community seems to embrace.
| throwitaway1123 wrote:
| ESLint and Prettier are the de facto default
| linter/formatter combo in JS. There are rules you can
| enable to enforce your preferred style of function
| [1][2].
|
| [1] https://eslint.org/docs/latest/rules/func-style
|
| [2] https://eslint.org/docs/latest/rules/prefer-arrow-callback
| berkes wrote:
| They are miles away from `gofmt` or `rustfmt` or `cargo
| clippy` and so on.
|
| It's not opinionated; it requires you to form your own
| opinion or at least choose from a palette of opinions.
|
| It requires effort to opt-in rather than effort to opt-
| out.
|
| The community doesn't frown on code that's not adhering
| to the common standard or code that doesn't pass the "out
| of the box" linter.
|
| So, if I have a typescript project with a tree of some 20
| dependencies (which is, unfortunately, a tiny project),
| I'll have at least five styles of code when I browse
| through it. Some JS, some TS, some strictly linted with
| "no-bikeshedding", some linted with configs that are
| bigger than the codebase itself. Some linted with
| outdated configs. Many not linted at all. It's really a mess.
| Even if each of the 20 dependencies is itself a clean
| beauty, the whole is an inconsistent mess.
| throwitaway1123 wrote:
| I personally believe that Prettier strikes a decent
| balance between being configurable vs being opinionated.
| I've used formatters that are much less opinionated than
| Prettier (e.g. the last time I used XCode it only handled
| indentation by default). ESLint also teeters on the edge
| between configuration vs convention quite well in my
| opinion, especially given the fact that browser JS and
| server side JS have vastly different linting
| requirements. I also love the extensibility of ESLint.
| Being able to write your own linting rules is a boon on
| productivity. Having access to custom framework specific
| linting rules is quite nice (I couldn't use React without
| the react-hooks/rules-of-hooks third party ESLint rules).
|
| > Some JS, some TS
|
| I think the JS community has done remarkably well amongst
| dynamically typed languages in settling on one form of
| gradual typing and adopting it fervently (Flow no longer
| has any market share at all). Whereas the last time I
| checked Python still had the Mypy/Pyright divide, and
| Ruby had the Sorbet/RBS dichotomy.
|
| Ultimately though, most of your critique boils down to
| the fact that JS (unlike Rust and Go) isn't maintained by
| a single monolithic entity, and therefore there's no one
| to dictate the standards you're looking for. If Deno were
| the sole caretaker of JS for example, we'd have a
| standard linter and formatter devoid of complex
| configuration, but Deno doesn't control JS.
|
| This is a consequence of JS being a collaborative product
| of the various browser vendors, TC39, and the server side
| JS runtimes that have adapted JS to run on servers. The
| advantage of this of course though is that JS can run
| natively in the browser. I think that's a decent tradeoff
| to make in exchange for having to wade through
| dependencies with different ideas about when it's
| appropriate to use arrow functions.
| jappgar wrote:
| These aren't equivalent as function foo will be hoisted
| but const foo will not be.
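The hoisting difference mentioned above can be seen in a short sketch. The function and variable names are made up; the example calls both forms before the line where they are written.

```javascript
function callBeforeDeclaration() {
  const results = {};

  // Works: a function declaration is hoisted together with its body.
  results.declared = hoisted();

  // Fails: the const binding exists but is uninitialized until its line runs
  // (the "temporal dead zone"), so access throws a ReferenceError.
  try {
    notHoisted();
  } catch (err) {
    results.errorName = err.name;
  }

  function hoisted() {
    return "ok";
  }
  const notHoisted = () => "ok";

  return results;
}

console.log(callBeforeDeclaration()); // { declared: 'ok', errorName: 'ReferenceError' }
```

For module-level exports that are only called after the module has finished evaluating, the difference rarely shows up, which may be why it is easy to overlook.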
| cxr wrote:
| Sure the food that this restaurant serves is pricey, but
| you have to remember that it also tastes terrible.
| crabmusket wrote:
| Yep, and that usually doesn't matter at the top level.
| NohatCoder wrote:
| But do they make much of a difference? You have always
| been able to write:
| myArray.sort(function(a,b){return a-b})
|
| People for some reason treat this syntactic sugar like it
| gives them some new fundamental ability.
| marcosdumay wrote:
| Oh Javascript would be much better if it could only be
| syntactic sugar...
|
| `function(a,b){return a-b;}` is different from `(a,b) =>
| a - b`
|
| And `function diff(a,b) {return a-b;}` is different from
| `const diff = (a,b) => a - b;`.
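Two of the differences that go beyond syntax can be sketched as follows. The `widget` object and `canConstruct` helper are made up for illustration; the list is not exhaustive (hoisting and `arguments` differ as well).

```javascript
const widget = {
  label: "widget",
  viaArrow() {
    // An arrow function has no `this` of its own; it uses viaArrow's.
    const read = () => this.label;
    return read.call({ label: "other" }); // call() cannot rebind it
  },
  viaFunction() {
    // A function expression gets a fresh `this` on every call.
    const read = function () {
      return this.label;
    };
    return read.call({ label: "other" });
  },
};

console.log(widget.viaArrow()); // "widget"
console.log(widget.viaFunction()); // "other"

// Only the `function` form can be used as a constructor.
function canConstruct(fn) {
  try {
    new fn();
    return true;
  } catch {
    return false;
  }
}

console.log(canConstruct(function () {})); // true
console.log(canConstruct(() => {})); // false
```

For a pure two-argument comparator like the `sort` example, none of this matters, which is probably why the two forms feel interchangeable there.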
| nsonha wrote:
| why the need to pronounce arbitrary preferences, who cares?
| ajuc wrote:
| I'm of the opinion that giving a global name to an
| anonymous function should result in a compilation error.
| supriyo-biswas wrote:
| You can still search for `<keyword> = \(.*\) => `, albeit
| it's a bit cumbersome.
| post-it wrote:
| All you need is `<keyword> =`
|
| Really, all you need is `<keyword>` and if the first result
| is a call to that function, just jump to its definition.
| spartanatreyu wrote:
| Exactly.
|
| Just search the definition.
|
| Any time that a function doesn't have a definition, it's
| never the target of a search anyway.
| troupo wrote:
| All you need is a tool that actually understands the
| language.
|
| It's 2024 and HN still suggests using regular expressions
| to search through a code base.
| lukan wrote:
| Regex is a universal tool.
|
| Your special tool might not work on platform X, or fail
| for edge cases - and you generally don't know how it
| works. With regex or simple string search - I am in
| control. And can understand why results show up, or
| investigate when they don't, but should.
| troupo wrote:
|             > Your special tool might not work on platform X
|
| As always, people come out with the weirdest of excuses
| to not use actual tools in the 99.9999% of the cases when
| they are available, and work.
|
|             When that tool doesn't work, or isn't sufficient, use
|             another one like fuzzy text search or regexps.
|
|             > and you generally don't know how it works.
|
|             Do you know how your stove works? Or do you truly
|             understand how the device you're typing this comment on
|             works?
|
|             Only in programming do I see people deliberately avoid
|             useful tools because <some fringe edge case that comes up
|             once in a millennium in their daily work>
| lukan wrote:
| When you specialize in one thing only, do what you want.
|
|               But I prefer tools that I can use wherever I go, so that
|               I am not dependent on, and chained to, one environment.
|
|               "Do you know how your stove works? Or do you truly
|               understand how the device you're typing this comment on
|               works?"
|
|               Also yes, I do.
|
|               "people deliberately avoid useful tools because <some
|               fringe edge case that comes up once in a millennium in
|               their daily work>"
|
|               Well, or I have already changed tools often enough to be
|               fed up with it, and would rather invest in tech that does
|               not lose its value in the next iteration of the innovation
|               cycle.
| troupo wrote:
| > When you specialize in one thing only, do what you
| want.
|
| I specialize in one thing only: programming
|
|                 > But I prefer tools that I can use wherever I go.
|
|                 Do you always walk everywhere, or do you use a tool
|                 available at the time, like cars, planes, bicycles,
|                 public transport?
|
|                 > rather invest in tech that does not lose its value in
|                 the next iteration of the innovation cycle.
|
|                 Things like "find symbol", "find usages", "find
|                 implementation" have been available in actual tools for
|                 close to two decades now.
| lukan wrote:
|                   I did not say I do not use what is available, but this
|                   debate is about generally having your code in a shape
|                   where simply searching for strings works.
| troupo wrote:
| Simply searching for strings rarely works well as the
| codebase grows larger. Because besides knowing where all
| things named X are, you want to actually see where X is
| used, or where it's called from, or where it is defined.
|
| With search you end up grepping the code twice:
|
| - first grepping for the name
|
| We're literally in a thread where people invent regexes
| for how to search the same thing (a function) defined in
| two different ways (as a function or as a const)
|
| - secondly, manually grepping through search results
| deducing if it's relevant to what you're looking for
|
| It becomes significantly worse if you want to include
| third-party libs in your search.
|
| There are countless times when I would just
| Cmd+B/Cmd+Click a symbol in IDEA and continue my
| exploration down to Java's own libraries. There are next
| to zero cases when IDEA would fail to recognise a
| function and find its usages if it was defined as a
| const, not as a function. Why would I willingly deny
| myself these tools as so many in this thread do?
| kragen wrote:
| if you think anything works in 99.9999% of cases, you've
| never programmed a computer
| wruza wrote:
| It's you who sees it as excuses. If I have a screwdriver
| multitool, I don't need another one which is for d10
| only. It simply creates _unnecessary_ clutter in a
| toolbox. The difference between definition and mention
| search for a function is: gr<bs><bs>ion
| name<cr> vs grname<cr>
|
| or for the current identifier, simply
| gr<m-w><cr>
|
|               I could even make my own useful tools like "\[fvm]gr"
| for function, variable or field search and brag about it
| watching miserable ide guys from the high balcony, but
| ain't that unnecessary as well.
| troupo wrote:
| > It simply creates unnecessary clutter in a toolbox.
|
| And then you proceed to... invent several pale imitations
| of a symbol/usages search.
|
| More here: https://news.ycombinator.com/item?id=41435862
| so as not to repeat myself
| wruza wrote:
| Doesn't really apply, ignores things just said.
| throwaway2037 wrote:
| Not to move the goal posts too much, but when I am
| searching a huge Python or Java code base from IntelliJ,
| I use a mixture of symbol and text search. One good thing
| about text search, you get hits from comments.
| troupo wrote:
| Yup. I do, too.
|
| I'm mostly ranting against this weird "we will never use
| great tools because full-text search" obsession
| renox wrote:
|           The thing is: in a large codebase the tool may become slow
|           or crash, and in a new language you may not have such a tool.
|           Grep is far more robust!
| troupo wrote:
|             When tools don't work or are unsuitable, you use different
| tools.
|
| And yet people are obsessed with never using useful tools
| in the first place because they can invent scenarios when
| this tool doesn't work. Even if these scenarios might
| never actually come up in their daily work.
| wruza wrote:
|           It's the current year and IDEs still can't remember how I just
| transformed the snippet of code and suggest to transform
| the rest of the files in the same way. All they can do in
| "refactor" menu is only "rename" and then some
| extract/etc nonsense which no one uses irl.
|
| By using regexps I have an experience that opens many
| doors, and the fact that they aren't automatic could make
| me sad, if only these doors weren't completely shut
| without that experience.
| troupo wrote:
| No one is stopping you from using regexps in IDEs.
|
| And you somehow manage to undersell the rename
| functionality in an IDE. And I've used move/extract
| functionality multiple times.
|
| I do however agree that applicable transformations (like
| upgrading to new syntaxes, or ways of doing stuff as
| languages evolve) could be applied wholesale to large
| chunks of code.
| spartanatreyu wrote:
| Yes JavaScript.
|
| You can search for both: "function" and "=>" to find all
| function expressions and arrow function expressions.
|
| All named functions are easily searchable.
|
| All anonymous functions are throw away functions that are
| only called in one place so you don't need to search for them
| in the first place.
|
| As soon as an anonymous function becomes important enough to
| receive a label (i.e. assigning it to a variable, being
| assigned to a parameter, converting to function expression),
| it has also become searchable by that label too.
| CGamesPlay wrote:
| The => is after the param spec, so you're searching for
| foo.*=> or something more complex, but then still missing
| multiline signatures. This is very easy to get caught by in
| TypeScript, and also happens when dealing with higher-order
| functions (quite common in React).
| spartanatreyu wrote:
|         Why are you searching for foo.*=>?
|
| Are you searching through every function, or functions
| that have a very specific parameter?
|
| And whatever you picked, why?
|
|         ---------------------------------------------------------
|
|         - If you're searching for every function, then there's no
|         need to search for foo.*=>, you only need to search for
|         function and =>.
|
| - If you're searching for a specific parameter, then just
| search for the parameter. Searching for functions is
| redundant.
|
|         ---------------------------------------------------------
|
| Arrow function expressions and function expressions can
| both be named or anonymous.
|
| Introducing arrow functions didn't suddenly make
| JavaScript unsearchable.
|
| JavaScript supported anonymous functions before arrow
| function expressions were introduced.
|
| Anonymous functions can only ever be:
|
| - run on the spot
|
| - thrown away
|
| - or passed around after they've been given a label
|
| Which means, whenever you actually want to search for
| something, it's going to be labelled.
|
| So search for the label.
| pjerem wrote:
| Yes but that's an anti pattern. Arrow functions aren't there
| to look cool, they're how you define lambdas / anonymous
| functions.
|
| Other than that, functions should be defined by the keyword.
| wiseowise wrote:
| How is that an anti-pattern?
|
| > Other than that, functions should be defined by the
| keyword.
|
| Says who?
| lukan wrote:
| All the wise ones. Well, except for you maybe.
|
| Serious arguments would be:
|
| - readability
|
| - greppability
| lukan wrote:
| (It wasn't an insult, but a joke on the username)
| tylerhou wrote:
|         As of a few years ago (not sure about now) the backtrace
|         frame info for anonymous functions was far worse than that
|         for ones defined via the function keyword with a name.
| hansworst wrote:
| Anonymous functions don't have names. This makes it much
| harder to do things like profiling (just try to find that
| one specific arrow function in your performance profile
| flame graph) and tracing. Tools like Sentry that
| automatically log stack traces when errors occur become
| much less useful if every function is anonymous.
| medstrom wrote:
| const foo = () => {}
|
| This function is not anonymous, it's called foo.
| BlarfMcFlarf wrote:
| Does the function know it's called foo for tracing/error
| logging/etc?
| croes wrote:
| But to call foo in bar you must define foo before bar.
|
| function foo(){} is also callable if bar is defined
| before foo.
| sestep wrote:
| Not true at the top-level.
| wruza wrote:
| Not sure what you find not true about it. All named
| "function"s get hoisted just like "var"s, I use post-
| definitions of utility functions all the time in file
| scopes, function scopes, after return statements,
| everywhere. You're probably thinking about
| const foo = function (){}
|
| without its own name before (). These behave like
| expressions and cannot be hoisted.
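|
|                 A minimal sketch of the hoisting behaviour discussed
|                 here, in plain JavaScript. The names declared, assigned,
|                 bar and helper are made up for illustration. It assumes
|                 nothing beyond standard language semantics: a function
|                 declaration can be called before its definition, while
|                 a const arrow function only has to be initialized by the
|                 time its caller actually runs.

```javascript
// Function declarations are hoisted with their bodies, so this call works
// even though the definition appears further down.
const early = declared();

function declared() {
  return "declared";
}

// A const binding exists from the start of the scope but stays uninitialized
// until its definition line has run; using it earlier throws.
let earlyError = null;
try {
  assigned();
} catch (err) {
  earlyError = err;
}

const assigned = () => "assigned";

// Definition order between two const functions does not matter, as long as
// both are initialized before the first call actually happens.
const bar = () => helper();
const helper = () => "helper";

console.log(early);           // "declared"
console.log(earlyError.name); // "ReferenceError"
console.log(bar());           // "helper"
```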
| MetaWhirledPeas wrote:
| > I use post-definitions of utility functions all the
| time in file scopes, function scopes, after return
| statements, everywhere
|
| I haven't figured out if people consider this a best
| practice, but I love doing it. To me the list of called
| functions is a high-level explanation of the code, and
| listing all the definitions first just buries the high-
| level logic "below the fold". Immediately diving into
| function contents outside of their broader context is
| confusing to me.
| wruza wrote:
| I don't monitor "best" practices, so beware. But in
| languages like C and Pascal I also had a habit of simply
| declaring all interfaces at the top and then grouping
| implementations reasonably. It also created a nice
| "index" of what's in the file.
|
| Hoisting also enables cross-imports without helper unit
| extraction headaches. Many hate js/ts at the "kids hate
| == and null" level but in reality these languages have a
| very practical design that wins so many rounds irl.
| mrighele wrote:
|             Interesting, it seems that the JavaScript runtime is
|             smart enough to detect this pattern and actually create a
|             named function (I tried Chrome and Node.js)
|                 const foo = () => {};
|                 console.log( foo.name );
|
|             actually outputs 'foo', and not the empty string that I
|             was expecting.
|                 const test = () => ( () => {} );
|                 const foo = test();
|                 console.log( foo.name );
|
|             outputs the empty string.
|
|             Is this behavior required by the standard?
| svieira wrote:
| Yes, in great detail.
| https://tc39.es/ecma262/multipage/ordinary-and-exotic-
| object... is the specification, and for the TL;DR
| https://developer.mozilla.org/en-
| US/docs/Web/JavaScript/Refe... is pretty good.
| Izkata wrote:
| You're probably remembering how it used to work. This is
| the example I remember from way back that we shouldn't
| use because (aside from being unnecessary and weird) this
| function wouldn't have a name in stack traces:
| var foo = function() {};
|
| Except nowadays it too does have the name "foo".
| mapcars wrote:
|             Not really, it's an anonymous function stored in a
| variable foo
| mostlylikeable wrote:
| To me, arrow functions behave more like I would expect
| functions to behave. They don't include all the magic
| bindings that the function keyword imparts. Feels more
| "pure" to me. Anonymous functions can be either function
| () {} or () => {}
| bobbylarrybobby wrote:
| Functions and arrow functions have an important difference:
| arrow functions do not create their own `this`. If you're
| in a context where a nested function needs to maintain
| access to the outer function's `this`, and you don't want
| to muck with `bind` or `call`, then you _need_ an arrow
| function.
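|
|       A small sketch of that difference, assuming a made-up Counter
|       class (not taken from any library). The arrow callback sees
|       the method's `this`; the plain function callback gets its own
|       `this`, which is undefined here.

```javascript
class Counter {
  constructor() {
    this.count = 0;
  }

  // Arrow callback: `this` is inherited from the enclosing method.
  addAllArrow(values) {
    values.forEach((v) => {
      this.count += v;
    });
    return this.count;
  }

  // Plain function callback: it has its own `this`. forEach is given no
  // thisArg and class bodies are strict-mode code, so `this` is undefined
  // and the property access throws a TypeError.
  addAllFunction(values) {
    values.forEach(function (v) {
      this.count += v;
    });
    return this.count;
  }
}

const total = new Counter().addAllArrow([1, 2, 3]);

let callbackError = null;
try {
  new Counter().addAllFunction([1, 2, 3]);
} catch (err) {
  callbackError = err;
}

console.log(total);              // 6
console.log(callbackError.name); // "TypeError"
```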
| albedoa wrote:
| I want to talk to the developer who considers greppability
| when deciding whether to use the "function" keyword but
| requires his definitions to be greppable by distancing them
| from their call locations. I just have a few questions for
| him.
| Pxtl wrote:
| You can still look for `(funcname)\s*=` can't you? I mean
| it's not like functions get re-declared a lot.
| bionsystem wrote:
| Doesn't cscope fit this usecase ?
| eddieh wrote:
| I used to define functions as `funcname (arglist)`
|
| And always call the function as `funcname(args)`
|
| So definitions have a space between the name and arg
| parentheses, while calls do not. Seemed to work well, even in
| languages with extraneous keywords before definitions since
| space + paren is shorter than most keywords.
|
|   Nowadays I don't bother since it really isn't that useful
| especially with tags or LSP.
|
| I still put the return type on a line of its own, not for
| search/grep, but because it is cleaner and looks nice to me--
| overly long lines are the ugliest of coding IMO. Well that and
| excessive nesting.
| semiinfinitely wrote:
| python also!
| jsjohnst wrote:
| Python is the only one mentioned that "actually works"
| without endless exceptions to the rule in the normal case.
| The ones mentioned (Rust/Javascript/Lisp/Go) all have
| specific syntax that is commonly enough used which makes it
| harder to search. Possible, absolutely, but still harder.
| zbentley wrote:
| I'd say Python works well at greppability because community
| conventions generally discourage concealing certain kinds
| of definitions (e.g. function definitions are usually "def
| whatever").
|
| However, that's just convention. Lots of modules do
| metaprogramming tricks that obscure greppability, which can
| be a pain. This is particularly acute when searching for
| code that is "import-time polymorphic"--that is, code which
| picks one of several implementations for a piece of
| functionality at import time at the module scope. That
| frequently ends up with some hanky-panky a la
| "exported_function_name = _implementation1 if
| platform_supported else _implementation2" at the module
| scope.
|
| While sometimes annoying, that type of thing is usually
| done for understandable reasons (picking an
| optimized/platform-supported implementation of an interface
| --think select or selectors in the stdlib, or any pypi
| implementation of filesystem monitoring using
| fsnotify/fanotify/kqueue/fsevents/ReadDirectoryChangesW).
| Additionally, good type annotations help with greppability,
| though they can't fully mitigate this issue.
|
| Much less defensible in Python is code that abuses
| locals/globals to indirect symbol access, or code that
| abuses star imports to provide interfaces/implementation
| switching.
|
| Those, fortunately, are rare, but the elephant in the "no
| greppability ever" room is not: getattr bullshit in OO code
| is so often utterly obscure, unnecessary and terrible. And
|       it's distressingly common on PyPI. At first I thought this
| was Ruby's encouragement of method_missing in the bad old
| days bleeding into the Python community, but the number of
| programmers for whom getattr magic is catnip seems to be
| disproportionate to the number of folks with Ruby
| experience, and, more concerningly, seems to me to be
| growing over time.
| koito17 wrote:
|   Golang has a similar property as a side-effect of the following
|   design decision.
|       ... the language has been designed to be easy to analyze
|       and can be parsed without a symbol table
|
| Taken from https://go.dev/doc/faq
|
| The "top-level declarations" in source files are exactly:
| package, import, const, var, type, func. Nothing else. If
| you're searching for a function, it's always going to start
| with "func", even if it's an anonymous function. Searching for
| methods implemented by a struct similarly only needs one to
| know the "func" keyword and the name of the struct.
|
| Coming from a background of mostly Clojure, Common Lisp, and
| TypeScript, the "greppability" of Go code is by far the best I
| have seen.
|
| Of course, in any language, Go included, it's always better to
| rely on static analysis tools (like the IDE or LSP server) to
| find references, definitions, etc. But when searching code of
| some open source library, I always resort to ripgrep rather
| than setting up a development environment, unless I found
|   something that I want to patch (in which case I set up the
|   development environment and rely on LSP instead of grep to
| discover definitions and references).
| eptcyka wrote:
| Golang gets zero points from me because function receivers
| are declared between func and the name of the function. God
| ai hate this design choice and boy am I glad I can use golsp.
| medstrom wrote:
| Is it just hard to get used to, or does it fundamentally
| make something more difficult?
| ljm wrote:
| Can't say I've ever had an issue with it, but it does get
| a bit wild when you have a function signature that takes
| a function and returns one, unless you clear it up with
|         some types.
|             func (s *Recv) foo(fn func(x any) err) func bar(y any) (*Recv, err)
|
| As an exaggerated example. Easy to parse but not always
| easy to read at a glance.
| kragen wrote:
| this thread is about using `grep` to find things, and
| this subthread is specifically about how the `func`
| keyword in golang makes it easy to distinguish the
|         definition of a function from its uses, so _yes_,
| because `grep 'func lart('` will not find definitions of
| `lart` as a method. you might end up with something like
| `grep 'func .*) *lart('` which is both imprecise and
| enough noise that you will not want to type it; you'll
| have to can it in a script, with the associated losses of
| flexibility and transparency
| medstrom wrote:
| That's fair, I see many examples in this thread where
| people pass an exact string directly to grep, as you do.
| I'm an avid grepper, but my grep tool [1] translates
| spaces to ".*?", so I would just type "func lart(" in
| that example and it would work.
|
| An incremental grep tool with just this one
| transformation rule gets you a lot more mileage out of
| grep.
|
| [1] https://github.com/minad/consult/blob/screenshots/con
| sult-li...
|
| EDIT: Better demo
| https://jumpshare.com/s/zMENBSr2LwwauJVjo1wS
| kragen wrote:
| that's going to find all the functions that take an
| argument named lart or of a lart type too, but it also
| sounds like a thing i really want to try
| vitus wrote:
| Also, anything that contains "func" and "lart" as a
| substring, e.g. foobar(function), blart(baz).
|
| It's not far off from my manually-constructed patterns
| when I want to make sure I find a function definition
| (and am willing to tolerate some false positives), but I
| personally prefer fine-grained control over when it's in
| use.
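|
|               A rough sketch of the space-to-".*?" rule from this
|               subthread, assuming every typed word should match
|               literally. escapeRegex and looseQuery are hypothetical
|               helpers written for this example and are not the linked
|               tool's implementation. The sample lines are invented, and
|               one of them reproduces the substring false positive
|               described above.

```javascript
// Escape regex metacharacters so every typed word is matched literally.
const escapeRegex = (text) => text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");

// "func lart(" becomes /func.*?lart\(/: words joined by a lazy wildcard.
const looseQuery = (input) =>
  new RegExp(input.trim().split(/\s+/).map(escapeRegex).join(".*?"));

const query = looseQuery("func lart(");

// Hypothetical lines to search through.
const lines = [
  "func lart(x any) error {",           // plain function definition
  "func (s *Recv) lart(x any) error {", // method definition with a receiver
  "foobar(function), blart(baz)",       // false positive: substrings match
  "func helper() {}",                   // no match
];

const hits = lines.filter((line) => query.test(line));

console.log(query.source);
console.log(hits);
```

|               The first three lines match and the last one does not,
|               which is the trade-off being discussed: less typing in
|               exchange for some noise.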
| medstrom wrote:
| Mmh, I type "func\ lart(" when I need the literal string.
| But it's less often, so it's fair that it's slightly more
| to type.
| kragen wrote:
| yeah!
| alienchow wrote:
|           For functions: grep -P '^func funcName\('
|
|           For methods: grep -P '^func [^)]+\) methodName\('
|
| Hope that helps.
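|
|           To see what the two patterns do and do not catch, here is
|           a sketch that applies JavaScript equivalents of them to a
|           few invented lines of Go. The sample source and the helper
|           names functionDef, methodDef and matchingLines are
|           assumptions made for this example only.

```javascript
// Hypothetical Go source, used only for this demonstration.
const goSource = [
  "package demo",
  "",
  "func lart(x int) int { return x }",
  "func (s *Server) lart(x int) int { return x }",
  "func caller() { lart(1) }",
].join("\n");

// JavaScript versions of the two grep -P patterns. The name is assumed to be
// a plain identifier, so it is interpolated without escaping.
const functionDef = (name) => new RegExp(`^func ${name}\\(`);
const methodDef = (name) => new RegExp(`^func [^)]+\\) ${name}\\(`);

// Test each line separately, the way grep does.
const matchingLines = (regex) =>
  goSource.split("\n").filter((line) => regex.test(line));

console.log(matchingLines(functionDef("lart")));
console.log(matchingLines(methodDef("lart")));
```

|           The first pattern finds only the plain function, the second
|           only the method with a receiver, and neither matches the
|           call site inside caller.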
| kragen wrote:
| thanks, that definitely looks better! pcre is kind of
| working against you here tho; i assume you're invoking it
| out of habit
| alienchow wrote:
| I am actually unaware of downsides of PCRE. Could you
| explain? I hardly ever use literal grep nowadays.
| kragen wrote:
|                 oh, i mean that instead of
|                     grep -P '^func [^)]+\) methodName\('
|
|                 you could say
|                     grep 'func [^)]*) methodName('
|
| which is a bit less typing
|
| however, i have to admit that i sort of ensnared myself
| in my own noose here by being too clever! i forgot that
| grep's regexp dialect only supports + if you \ it, and it
| took me six tries to figure out why i wasn't getting any
| grep hits. there's a lot to be said for the
| predictability and consistency of pcre!
| kiitos wrote:
|           No need for any func, you can just
|               grep '\) methodName\('
| kragen wrote:
| grep: Unmatched ) or \)
|
| but your main point might be right; the few non-method
|             matches to (pcre) '\)\s*\w+\s*\(' in /usr/share/go-1.19
|             seem to be uncommon things like this:
|                 static void __attribute__ ((constructor)) sigsetup(void) {
|                 void poison() __attribute__ ((weak));
|                 C3 = -(R + I) // ADD(5,6) NEG(-5,-6)
| eptcyka wrote:
| I have to always add wildcards between func and the
| function name, because I can never know how the other
| developer has decided to specify the name of the
| receiver. This will always be a problem as far as
| grepping with primitive tools that don't parse the
| language.
| medstrom wrote:
| FYI, many people use thin wrappers like this, it's still
| a primitive tool that doesn't parse the language, but it
| can handle that problem:
| https://jumpshare.com/s/zMENBSr2LwwauJVjo1wS (GIF)
| eptcyka wrote:
| On machines where I control the tooling, this is not an
| issue. But I can't take my config to my colleagues
| machine.
| lanstin wrote:
| If only AFS had succeeded. What would a modern version of
| this look like?
| sethammons wrote:
| I search ") myFunc" to find member functions. It would be
|           nice to search "c myFunc", but a parenthesis works
| executesorder66 wrote:
| How many God AI's have expressed their hate for this
| design? /s
| kazinator wrote:
| Receivers are utterly idiotic. Like how could anyone with
| two working brain cells sign off on something like that?
|
| If you don't want OOP in the language, but want people to
| be able to write thing.function(arg), you just make
| function(thing, arg) and thing.function(arg) equivalent
| syntax.
| Pxtl wrote:
| C# did this for extension methods and it Just Works. You
| just add the "this" keyword to a function in a pure-
| static class and you get method-like calling on the first
| param of that function.
| kazinator wrote:
| If the function has to be modified in any way in order to
| grant permission to be used that way, then it is not
| quite "did this".
|
| Equivalent means that there is no difference at the AST
| level between o.f(a) and f(o, a), like there is no
|           difference in C among *(a + i), a[i], i[a] and *(i + a).
|
| However, a this keyword is way better than making the
| programmers fraction off a parameter and move it to the
| other side of the function name.
| tczMUFlmoNk wrote:
| A search term here is "Uniform Function Call Syntax", as
| present in (e.g.) D:
|
|         https://en.wikipedia.org/wiki/Uniform_Function_Call_Syntax
| alienchow wrote:
|       I very rarely use literal grep at all. Perl grep is my
|       standard go-to.
|
|       For functions: grep -P '^func funcName\('
|
|       For methods: grep -P '^func [^)]+\) methodName\('
| aadhavans wrote:
| What about piped grep statements?
|
| grep func | grep functionName
| madeofpalk wrote:
| The culture of single letter variables in golang, at least in
| the codebases I've seen, undoes this.
| VonGallifrey wrote:
| The way I have seen this is that single letter variables
| are mostly used when declaration and (all) usages are very
| close together.
|
| If I see a loop with i or k, v then I can be fairly
| confident that those are an Index or a Key Value pair. Also
| I probably don't need to grep them since everything
| interacting with these variables is probably already on my
| screen.
|
| Everything that has a wider scope or which would be unclear
| with a single letter is named with a more descriptive name.
|
| Of course this is highly dependent on the people you work
| with, but this is the way it works on projects I have
| worked on.
| lelanthran wrote:
| > The culture of single letter variables in golang, at
| least in the codebases I've seen, undoes this.
|
| The convention, not just in Go, is that the smaller the
| scope, the smaller the variable reference.
|
| So, sure, you're going to see single-letter variables in
| short functions, inside short block scopes, etc, but that
| is true of almost any language.
|
| I haven't seen single-letter variables in Go that are in a
|       scope that _isn't_ short.
|
|       Of course, this could just mean that I haven't seen enough
|       of other people's Go source.
| iudqnolq wrote:
| I like using l for logger and db for database
| client/pool/handle even if there's a wider scope. And if
| the bulk of a file is interacting with a single client I
| might call that c.
| marcosdumay wrote:
| > that is true of almost any language
|
| You'd be surprised how often language-local cultures
| break that rule on either side. And a few times it's even
| an improvement.
| lanstin wrote:
| Zipf's law, right - these rules are a formalization of
| our brain's functionality with language.
|
| Of course, with enough code, someone does everything.
| kazinator wrote:
| E.g. food and art are very important in Japan, so stomach
|           is _i_ and a drawing/painting is _e_.
| BrandoElFollito wrote:
| Food is very very important in France so we call it
| _nourriture_ :)
| alienchow wrote:
| Single letter variables in Golang are to be used in small,
| local contexts. Akin to the throwaway i var in for loops.
| You only grep the struct methods, the same way no one greps
| 'this' or 'self'.
|
| The code bases you've been reading, and even some of the
| native libraries, don't do it properly. Probably due to
| legacy reasons that wouldn't pass readability approvals
| nowadays.
| vitus wrote:
| I'm not so sure about greppability in the context of Go. At
| least at Google (where Go originates, and whose style guide
| presumably has strong influence on other organizations' use
| of the language), we discourage "stuttering":
|
| > A piece of Go source code should avoid unnecessary
| repetition. One common source of this is repetitive names,
| which often include unnecessary words or repeat their context
| or type. Code itself can also be unnecessarily repetitive if
| the same or a similar code segment appears multiple times in
| close proximity.
|
| https://google.github.io/styleguide/go/decisions#repetitive-.
| ..
|
| (see also https://google.github.io/styleguide/go/best-
| practices#avoid-...)
|
| This is the style rule that motivates the sibling comment
| about method names being split between method and receiver,
| for what it's worth.
|
| I don't think this use case has received much attention
| internally, since it's fairly rare at Google to use grep
| directly to navigate code. As you suggest, it's much more
| common to either use your IDE with LSP integration, or Code
| Search (which you can get a sense of via Chromium's public
| repository, e.g. https://source.chromium.org/search?q=v8&sq=&
| ss=chromium%2Fch...).
| klodolph wrote:
| The thing about stuttering is that the first part of the
| name is fixed anyway, MOST of the time.
|
| If you want to search for `url.Parse`, you can find most of
| the usages just by searching for `url.Parse`, because the
| package will generally be imported as `url` (and you won't
| import Parse into your namespace).
|
| It's not as good as find references via LSP but it is like
| 99% accurate and works with just grep.
| vitus wrote:
| That works somewhat for finding uses of a native Go
| module, although I've seen lots of cases where the module
| name is autogenerated and so you need to import with an
| alias (protobufs, I'm looking at you). It also doesn't
| work quite so well in reverse -- you need to find files
|         with `package url;`, then find instances of '\bfunc
|         Parse\('.
|
| Lastly, one of the bigger issues is (as aforementioned
| sibling commenter mentioned) the application of this
| principle to methods. This is especially bad with method
| names for established interfaces, like Write or String.
| Even with a LSP server, you then need to trace up the
| call stack to figure out what concrete types the function
| is being called with, then look at the definitions of
| those types. I can't imagine wanting to do that with only
| (rip)grep at my disposal.
| alexvitkov wrote:
| Wild that the people who made the language that forces me
|       to type
|           if err != nil {
|               return nil, err
|           }
|
| after every second statement are advising against code
| repetition.
| kiitos wrote:
| Wild that you don't distinguish between error handling
| and name repetition.
| alexvitkov wrote:
| The issue is about "repetition" - a concept that exists
| on the syntactic level, and is unavoidable in Go.
|
| To put it in a way you may understand, it's when you have
| to press keys in the same order a lot of times and you
| get sad.
| remram wrote:
| In golang you get `func (someName someType) funcname`, so
| it's much less greppable than languages using `func funcname`
| tuetuopay wrote:
| Go is horrible due to the absence of specific "interface
| implementation" markers. Gets pretty hard to find where or
| how a type implements an interface.
| kiitos wrote:
| Interfaces are consumer contracts, they say "I need this"
| not "I provide this".
|
| Regardless, any reasonable LSP will let you "Find
| implementations" of any interface definition.
| bryanrasmussen wrote:
| JavaScript has multiple ways to define a function so you sort
| of lose that getting the actual definition benefit.
|
| on edit: I see someone discussed that you can grep for both
| arrow functions and named function at the same time and I
| suppose you can also construct a query that handles a function
| constructor as well - but this does not really handle curried
| functions or similar patterns - I guess at that point one is
| letting the perfect become the enemy of the good.
|
| Most people grepping know the code base and the patterns in
| use, so they probably only need to grep for one type of
| function declaration.
| veltas wrote:
| For most functions ^\S.*name( will find declarations and
| definitions.
|
| Most of us use exuberant ctags to allow jumping to definitions.
| skywal_l wrote:
| Yet you reply to an article that defines functions as
| variables, which I've seen a lot of developers do usually for
| no good reason at all.
|
|   To me, that's a much more common and worse practice with regard
|   to greppability than splitting identifiers using strings, which
|   I haven't seen much in the wild.
| kazinator wrote:
| C has "classical" tooling like Cscope and Exuberant Ctags. The
| stuff works very well, except on the odd weird code that does
| idiotic things that should not be done with preprocessing.
|
| Even for Lisp, you don't want to be grepping, or at least not
| all the time for basic things.
|
| For TXR Lisp, I provide a program that will scan code and build
| (or add to) your tags file (either a Vim or Emacs compatible
| one).
|
|   Given
|       (defstruct point () x y)
|
| it will let your editor jump to the definition of point, x and
| y.
| dan-robertson wrote:
|   Not sure this is very true for Common Lisp. A classic example is
|   accessor functions, where the generic function is created by
| whichever class is defined first and the method where the class
| is defined. Other macros will construct new symbols for
| function names (or take them from the macro arguments).
| f1shy wrote:
| Still you can extend the concept without a lot of work,
| couldn't you?
| db48x wrote:
| That's true, but I regard it as fairly minor. Accessor
| functions don't have any logic in them, so in practice you
| don't have to grep for them. But it can be confusing for new
| players, since they don't know ahead of time which ones are
| accessors and which are not.
| hgomersall wrote:
| Though glob imports in rust can hide a source, so those should
| be avoided.
| mre wrote:
| Exactly. I wrote an entire blog post about that:
| https://corrode.dev/blog/dont-use-preludes-and-globs/
| zarzavat wrote:
| C is so much worse than that. Many people declare symbols using
| macros for various reasons, so you end up with things like
| DEFINE_FUNCTION(foo) {. In order to get a complete list of
| symbols you need to preprocess it, this requires knowing what
| the compiler flags are. Nobody really knows what their compiler
| flags are because they are hidden between multiple levels of
| indirection and a variety of build systems.
| skissane wrote:
| > C is so much worse than that. Many people declare symbols
| using macros for various reasons, so you end up with things
| like DEFINE_FUNCTION(foo) {.
|
| That's not really C; that's a C-based DSL. The same problem
| exists with Lisp, except even worse, since its preprocessor
| is much more powerful, and hence encourages DSL-creation much
| more than C does. But in fact, it can happen with any
| language - even if a language lacks any built-in processor or
| macro facility, you can always build a custom one, or use a
| general purpose macro processor such as M4.
|
| If you are creating a DSL, you need to create custom tooling
| to go along with it - ideal scenario, your tools are so
| customisable that supporting a DSL is more about
| configuration than coding something from scratch.
| kragen wrote:
| the issue is that the c preprocessor is always available
| and usually used
| skissane wrote:
| Other languages have preprocessors or macro facilities
| too.
|
| C's is very weak. Languages with more powerful
| preprocessors/macros than C's include many Lisp dialects,
| Rust, and PL/I. If you think everyone using a weak
| preprocessor is bad, wait until you see what people will
| do when you give them a powerful one.
|
| Microfocus COBOL has an API for writing custom COBOL
| preprocessors in COBOL (the Integrated Preprocessor
| Interface). (Or some other language, if you insist.) I
| bet there are some bizarre abominations hidden in the
| bowels of various enterprises based on that ("our
| business doesn't just run on COBOL, it runs on our own
| custom dialect of COBOL!")
| kragen wrote:
| c's macro system is weak on purpose, based on, i suspect,
| bad experiences with m6 and m4. i think they thought it
| was easier to debug things like ratfor, tmg, lex, and
| (much later) protoc, which generate code in a more
| imperative paradigm for which their existing debugging
| approaches worked
|
| i can't say i think they were wholly wrong; paging
| through compiler error messages is not my favorite part
| of c++ templates. but i have a certain amount of
| affection for what used to be called gasp, the gas macro
| system, which i've programmed for example to compute jump
| offsets for compiling a custom bytecode. and i think m4
| is really a pathological case; most hairy macro systems
| aren't even 10% as bad as m4, due to a combination of
| several tempting but wrong design decisions. lots of
| trauma resulted
|
| so when they got a do-over they eliminated the
| preprocessor entirely in golang, and compensated with
| reflection, which makes debugging easier rather than
| harder
|
| probably old hat to you, but i just learned last month
| how to use x-macros in the c preprocessor to
| automatically generate serialization and deserialization
| code for record types (speaking of cobol):
| http://canonical.org/~kragen/sw/dev3/binmsg_cpp.c (aha, i
| see you're linking to a page that documents it)
| skissane wrote:
| C's is weak yet not weak - you can do various advanced
| things (like conditional expansion or iteration), but
| using esoteric voodoo with extreme performance cost.
| Whereas other preprocessors let you do that using
| builtins which are fast and easy to grok.
|
| See for example
| https://github.com/pfultz2/Cloak/wiki/C-Preprocessor-
| tricks,...
|
| Poor C preprocessor performance has a negative real world
| impact, for example recently with the Linux kernel -
| https://lwn.net/Articles/983965/ - a more powerful
| preprocessor would enable people to do those things they
| are doing anyway much more cheaply
| lanstin wrote:
| I've always suspected the powerful macro facilities in
| Lisp are why it's never been very common - the ability to
| do proper macros means all the very smart programmers
| create code that has to be read like a maths paper. It's
| too bespoke to the problem domain and too tempting to
| make it short rather than understandable.
|
| I like Rust (tho I have not yet programmed in it) but I
| think if people get too into macro generated code, there
| is a risk there to its uptake.
|
| It's hard for smart programmers to really believe this,
| but the old "if you write your code as cleverly as
| possible, you will not be able to debug it" is a useful
| warning.
| kazinator wrote:
| If your Lisp macro starts with a symbol whose name begins
| with _def_ , and the next symbol is a name, then good old
| Exuberant Ctags will index it, and you get jump to
| definition.
|
| Not so with DEFINE_FUNCTION(foo) {, I think.
|     $ cat > foo.lisp
|     (define-musical-scale g)
|     $ ctags foo.lisp
|     $ grep scale tags
|     g       foo.lisp        /^(define-musical-scale g)$/;"  f
|
| Exuberant Ctags is not even a tool from the Lisp culture. I
| suspect it is mostly shunned by Lisp programmers. Except
| maybe for the Emacs one, which is different. (Same _ctags_
| command name, completely different software and tag file
| format.)
| db48x wrote:
| Yes, the usefulness of macros always has to be balanced
| against their cost. I know of only one codebase that does
| this particular thing though, Emacs. It is used to define
| Lisp functions that are implemented in C.
| shadowgovt wrote:
| It's a common pattern for just about any binding of
| C-implementation to a higher-level language. Python has a
| similar pattern, and I once had to re-invent it from
| scratch (not knowing any of this) for a game engine.
| akritid wrote:
| Looks fine (subjective) and there is also ctags
| sva_ wrote:
| People don't use LSP?
| gregjor wrote:
| That's right, not everyone uses an LSP. Nothing wrong with
| LSPs, very useful tools. I use ripgrep, or plain grep if I
| have to, far more often than an LSP.
|
| Working with legacy code -- the scenario the author describes
| -- I often can't install anything on the server.
| giancarlostoro wrote:
| Fun fact: VS Code uses ripgrep by default.
|
| https://github.com/microsoft/vscode-ripgrep
| menaerus wrote:
| LSP doesn't always work without issues with large C and C++
| codebases which is why one needs to fallback to grep
| techniques.
| gregjor wrote:
| ctags.
| mav3ri3k wrote:
|   Although in Rust, function-like macros make it super hard to
|   trace code. I like them when I am writing the code and hate
|   them when I have to read others' macros.
| throwawayffffas wrote:
| > Meanwhile C lacks any such keyword
|
| It's a hassle. But not the end of the world.
|
| I usually search for "doTheThing\\(.+?\\) \\{" first.
|
| If I don't get a hit, or too many hits I move to
| "doTheThing\\([^\\)]*?\\) \\{" and so on.
| jampekka wrote:
| Rust though does lose some of those points by more or less
| forcing[1] snake_case. It's really annoying to navigate
| bindings which are converted from camelCase.
|
| I don't care which case is used. It's a trivial superficial
| thing, and tribal zealotry about such doesn't reflect well on
| the language and community.
|
|   [1] The warnings can be turned off, but in some cases it
|   requires ugly hacks, and the community seems to be actively
|   hostile to making it easier.
| kibwen wrote:
| The Rust community is no more zealous about naming
| conventions than any other language which has naming
| conventions. Perhaps you're arguing against the concept of
| naming conventions in general, but that's not a Rust thing,
| every language of the past 20 years suggests naming
| conventions if for no other reason than every language
| provides a standard library which needs to follow some sort
| of naming conventions itself. Turning off the warnings
| emitted by the Rust compiler takes two lines of code, either
| at the root of the crate or in the crate manifest.
| jampekka wrote:
|       I've yet to encounter another compiler that warns about
|       naming conventions, by default at least. So at least it's
|       the most enforced zealotry I've encountered.
|
| Yes, it can be turned off. But for e.g. bindgen generated
| code it was not trivial to find out.
| kibwen wrote:
| The Rust compiler doesn't produce warnings out of
| zealotry, but rather as a consequence of pre-1.0
| historical decisions. Note that Rust doesn't use any
| syntax in pattern matching contexts to distinguish
| between bindings and enum variants. In pre-1.0 versions
| of Rust, this created footguns where an author might
| think they were matching on an enum, but the compiler was
| actually parsing it as a catch-all binding that would
| cause any following match arms to never be executed. This
| was exacerbated by the prevailing naming conventions of
| the time (which you can see in this 2012 blog post:
| https://pcwalton.github.io/_posts/2012-06-03-maximally-
| minim... (note the lower-cased enum variants)). So at
| some point the naming conventions were changed in an
| attempt to prevent this footgun, and the lint was
| implemented to nudge people over to the new conventions.
| However, as time went on the footgun was otherwise fixed
| by instead causing the compiler to prioritize parsing
| enum variants rather than bindings, in conjunction with
| other errors and warnings about non-exhaustive patterns
| and dead code (which are all desirable in their own
| right). At this point it's mostly just vestigial, and I
| highly doubt that anybody really cares about it beyond
| "our users are accustomed to this warning-by-default, so
| they might be surprised if we stopped doing this".
| jampekka wrote:
| Ah, thanks for the info! I do think this default does
| have some ramifications, especially in that binding
| casings are typically changed due to it even for "non-
| native" wrappers which I find materially makes things
| more difficult.
| wruza wrote:
| _Meanwhile C lacks any such keyword, so the best you can do is
| search for the name. That gets you a sea of callers with the
| declarations and definitions mixed in_
|
| That's why in my personal projects I follow classic
| "type\nname" and grep with "^name\>".
|
| _looks ugly_
|
| Single line definitions with long, irregular type names and
| unaligned function names look ugly. Col 1 names are not only
| greppable but skimmable. I can speedscroll through code and
| still see where I am.
| akira2501 wrote:
| > so the best you can do is search for the name
|
| This is why in C projects libs go in "lib/" and sources go in
| "src/". If your header files have the same directory structure
|   as libs, then "include/" is also a decent way to find
| definitions.
| suprjami wrote:
| > Meanwhile C lacks any such keyword, so the best you can do
| is...
|
| ...use source code tagging or LSP.
| marcosdumay wrote:
| Those also make your language easier to parse, and to read.
|
| Many people insist that IDEs make the entire point moot, but
|   that's the kind of thing that makes IDEs easier to write and
| debug, so I disagree.
| johannes1234321 wrote:
| One thing which works for C is to search something like `[a-z]
| foo\\(.+\\) \\{` assuming that spacing matches the coding
| style, often the shorter form `[a-z] foo\\(` works well, which
| tries to ensure there is a type definition and bin assignment
|   or something before the name. Then there are only a handful of
|   false positives.
| andersa wrote:
| Do people really use text search for this rather than an IDE
| that parses all of the code and knows exactly where each
| declaration is, able to instantly jump to them from a key press
| on any usage...? Wild.
| iamwil wrote:
| Yes. Not everyone uses or likes an IDE. Also, when you lean
| on an IDE for navigation, there is a tendency to write more
| complicated code, since it feels easy to navigate, you don't
| feel the pain.
| fsckboy wrote:
| C, starting with K&R, has all declarations and definitions on
| lines at the left margin, and little else. this is easy to grep
| for.
| wpollock wrote:
| In the bygone days of ctags, C function definitions included a
| space before opening parenthesis, while function calls never
| had that space. I have a hard time remembering that modern
| coding styles never have that space and my IDE complains about
| it. (AFAIK, the modern gtags doesn't rely on that space to
| determine definitions.) Even without *tags, the convention made
| it easy to grep for definitions.
| mzs wrote:
| space after builtin was recommended instead:
|     if (x == 0) { ...
|     sizeof (buf);
|     return (-1);
|     exit(0);
| drewg123 wrote:
| In terms of C, that's one reason I prefer the BSD coding style:
|
|   int
|   foo(void) { }
|
| vs the Linux coding style:
|
| int foo(void) { }
|
| The BSD style allows me to find function definitions using git
| grep ^foo.
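|
|   A rough sketch of why that works, with Python's re module
|   standing in for git grep. Both C snippets and the helper name
|   grep_line_start are invented for the example.

```python
import re

# The same two functions, formatted in each style (hypothetical code).
BSD_STYLE = """\
int
foo(void)
{
    return 0;
}

int
bar(void)
{
    return foo();
}
"""

LINUX_STYLE = """\
int foo(void)
{
    return 0;
}

int bar(void)
{
    return foo();
}
"""


def grep_line_start(name, source):
    # 1-based numbers of the lines that begin with the given name.
    pattern = re.compile(r"^" + re.escape(name) + r"\b")
    return [
        number
        for number, line in enumerate(source.splitlines(), 1)
        if pattern.match(line)
    ]


print(grep_line_start("foo", BSD_STYLE))    # only the definition line
print(grep_line_start("foo", LINUX_STYLE))  # nothing starts with "foo"
```

|   With the return type on its own line, the name sits in column 1
|   only at the definition, so the anchored search skips the call
|   inside bar. In the one-line style the same search finds nothing.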
| leogout wrote:
|   JavaScript is a bit trickier nowadays, I think, with the fat
|   arrow notation:
|
|   const myFunc = () => console.log("can't find me :p");
| darepublic wrote:
| There is arrow syntax with js
| recursivecaveat wrote:
| A small, but underappreciated benefit of grammar changes like
| from the form `mytype myfun()` to `keyword myfun() sigil mytype`
| binary132 wrote:
| This can even be as simple as not using multi-line error strings,
| or expanding variables in them.
| mashlol wrote:
| Greppable commit messages and descriptions are also important,
| for a similar reason. If you want to learn where a feature exists
| in the codebase, searching the commits for where it was added is
| often easier than trying to grep through the codebase to find it.
| Once you've found the commit or even a nearby commit, it's much
| easier to find the rest.
| WalterBright wrote:
| That's why D has a cast keyword:
|
| ubyte c = cast(ubyte)i;
|
| instead of:
|
| unsigned char c = (unsigned char)i;
|
| Casts are a blunt instrument that subvert the type system, and so
| they need to be greppable.
|
| Having the cast keyword also removes the grammatical ambiguities
| in the expression syntax.
| jenadine wrote:
| Do you often grep for casts? I never do that.
| aa-jv wrote:
| Try to think about why you might want to do that. It makes a
| lot of sense, but if you're not doing it, that might be
| enlightening...
| WalterBright wrote:
| I regard every cast as a bug in my own code and try to
| refactor it so there aren't any. I can't get rid of all of
| them, but they're always worth a second look.
|
| I don't normally grep for them, but others have told me they
| did.
|
| P.S. one thing about D is you can do things like this:
|     ubyte b = i;            // error, losing bits
|     ubyte b = cast(ubyte)i; // ugly cast
|     ubyte b = i & 0xFF;     // no cast, no error!
|
|     It's just one of the nice little details that make programming
|     in D a pleasure.
| Too wrote:
| These things are better checked automatically with a static
| linter, that presumably already has a full parser and AST
| representation of the language and knows what is a cast and
| not.
|
| Nobody should keep a checklist of "100 things to grep for
| when doing a code review".
| SnowflakeOnIce wrote:
| When doing appsec review in C or C++, yes!
| EasyMark wrote:
| Honestly I know I don't do that either. I mean if there was
| some special case where I remembered "oh yeah I had to cast
| that variable in this special case". In general I avoid
| casting as much as I can in C/C++, but especially in C.
| jackphilson wrote:
| I wonder - why isn't this talked about more? We have had tens of
| thousands of software companies, each with probably a dozen
| people focused on hyperoptimizing everything. Why hasn't this
| point been talked about more on the internet to the point where
| it's obvious today? And it's not specifically about this, it's
| more in general. Do people just learn this on their own, and not
| say anything? Or is the discussion related to this topic buried
| in some old forum somewhere?
| mrkeen wrote:
| It's talked about, just in the opposite direction.
|
| I've left hardcoded strings (think Kafka event type names) in
| my source for this very reason, but after a round of code
| review they get squirreled away as constants in separate files
| because string repetition is bad or something.
| jimmaswell wrote:
| Without constants, it's too easy to let a typo sneak in or
| have inconvenience later replacing one "event" but not
| replacing an unrelated "event". I'll only do it if the string
| is used two times at most, but usually I'll make a constant
| the first time and it doesn't feel like any loss.
| mrkeen wrote:
| Yes, this is exactly what I was fighting against.
|
| If I have three classes that interact with "MyTable", then
| I can grep for places that interact with "MyTable" and I
| get back three classes.
|
| After refactoring, the class which now knows about
| "MyTable" is Constants.java, which has no business knowing
| about "MyTable". Grepping it now turns up a false-positive
| and finds 0 of the actual usage sites (3 false-negatives).
| GeneralMayhem wrote:
| Sure, but now you have the string constant as a symbol,
| which you can either grep for (in which case you're
| delayed by one search, not the end of the world if you
| were going to unwind callstacks anyway) or, if you have
| an LSP, you can jump directly from it to users...
| NotMichaelBay wrote:
| It's not exactly a false positive. It's just a level of
| indirection, 1 more search by the constant name to find
| usages. What you sacrifice there you gain by having the
| compiler help find typos and the IDE help with
| autocompletion.
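|
|         A small sketch of that two-step search, simulated in Python
|         over an in-memory set of files. The file names, contents and
|         the helper find_usages_of_literal are all hypothetical.

```python
import re

# Hypothetical project: one constants file and three other classes.
FILES = {
    "Constants.java": 'public static final String MY_TABLE = "MyTable";',
    "OrderDao.java": "db.query(Constants.MY_TABLE, orderId);",
    "AuditJob.java": "db.scan(Constants.MY_TABLE);",
    "Report.java": 'log.info("report done");',
}


def grep(pattern, files):
    # Names of the files whose text matches the regular expression.
    rx = re.compile(pattern)
    return sorted(name for name, text in files.items() if rx.search(text))


def find_usages_of_literal(literal, files):
    quoted = re.escape('"' + literal + '"')
    # Search 1: where does the literal itself appear?
    definitions = grep(quoted, files)
    # Pick up the constant names the literal is assigned to.
    assignment = re.compile(r"(\w+)\s*=\s*" + quoted)
    constants = {
        match.group(1)
        for text in files.values()
        for match in assignment.finditer(text)
    }
    # Search 2: where are those constants referenced?
    usages = set()
    for constant in constants:
        usages.update(grep(r"\b" + re.escape(constant) + r"\b", files))
    return definitions, sorted(usages - set(definitions))


print(find_usages_of_literal("MyTable", FILES))
```

|         The first search lands only on the constants file; the second,
|         by constant name, recovers the real usage sites.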
| philipwhiuk wrote:
| `Constants.java` is a massive code-smell (which I have in
| many projects, but it's still a smell).
|
| The file name is awful.
|
| At worst it should be 'DbConstants' but probably they
| should be defined elsewhere.
| amingilani wrote:
| I agree that code searchability is a good thing but I disagree
| with those examples. They intentionally increase the chance of
| errors.
|
| Maybe there's an alternative way to achieve what the author set
| out but increasing searchability at the cost of increasing
| brittleness isn't it for me.
|
| In this example:
|
| const getTableName = (addressType: 'shipping' | 'billing') => {
|   return `${addressType}_addresses`
| }
|
| The input string and output are coupled. If you add string
| conditionals as the author did, you introduce the chance of a
| mismatch between the input and output.
|
| const getTableName = (addressType: 'shipping' | 'billing') => {
|   if (addressType === 'shipping') { return 'shipping_addresses' }
|   if (addressType === 'billing') { return 'billing_addresses' }
|   throw new TypeError('addressType must be billing or shipping')
| }
|
| Similarly, flattening dictionaries for readability introduces the
| chance of a random typo making our lives hell. A single typo in
| the repetitions below will be awful.
|
| {
|   "auth.login.title": "Login",
|   "auth.login.emailLabel": "Email",
|   "auth.login.passwordLabel": "Password",
|   "auth.register.title": "Login",
|   "auth.register.emailLabel": "Email",
|   "auth.register.passwordLabel": "Password",
| }
|
| Typos aren't unlikely. In a codebase I work with, we have a
| perpetually open ticket about how ARTISTS is mistyped as ATRISTS
| in a similarly flat enum.
|
| The issue can't be solved easily because the enum is now copied
| across several codebases. But the ticket has a counter for the
| number of developers that independently discovered the bug and
| it's in the mid two digits.
| Noumenon72 wrote:
| Typos are find-and-fix-once, while unsearchability is a
| maintenance burden forever.
|
| I don't think coupling variable names by making sure they
| contain the same strings is the best way to show they're
| related, compared to an actual map from address type to table
| name. There might be a lot of things called 'shipping' in my
| app, only some of which are coupled to `shipping_addresses`.
|
| Shouldn't a linter be able to catch that there is no enum
| member called MyEnum.ATRISTS, or is it not an actual enum?
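|
|   For illustration, such a map could look roughly like this in
|   Python. The names TABLE_NAMES and get_table_name are assumptions
|   for the sketch, not code from the article.

```python
# Each full table name appears literally, so searching the codebase
# for "shipping_addresses" leads straight to this mapping.
TABLE_NAMES = {
    "shipping": "shipping_addresses",
    "billing": "billing_addresses",
}


def get_table_name(address_type):
    # Look the name up instead of assembling it from string pieces.
    try:
        return TABLE_NAMES[address_type]
    except KeyError:
        allowed = ", ".join(sorted(TABLE_NAMES))
        raise ValueError(f"address_type must be one of: {allowed}") from None


print(get_table_name("billing"))
```

|   The relationship between input and output is stated once, in
|   data, and an unknown address type fails loudly instead of
|   producing a table name that does not exist.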
| ctxc wrote:
| Agree with you.
|
| What happens when translation files get too big and you want to
| split and send only relevant parts? Like send only auth keys
| when user is unauthenticated?
|
| `return translations[auth][login]` is no longer possible.
|
| Or just imagine you want to iterate through `auth` keys.
| _shudders_
| usrusr wrote:
| Entrenched typos like ATRISTS are actually a greppability
| goldmine. Chances are there are more occurrences of pluralized
| people who are making art in the codebase, but only ATRISTS is
| the one from that enum.
|
| I certainly would not suggest deliberately mistyping, but there
| are places where the benefit is approaching the cost. Certain
| log messages can absolutely benefit from subtle letter garbling
| that retains readability while adding uniqueness.
| kaelwd wrote:
| REFERER moment.
| peeters wrote:
| > The input string and output are coupled. If you add string
| conditionals as the author did, you introduce the chance of a
| mismatch between the input and output.
|
| I think it depends on whether the repetition is accidental or
| intrinsic. Does the table name _happen_ to contain the address
| type as a prefix, or does it intrinsically have to?
|   Greppability aside, when things are incidentally related, it's
|   often better to repeat yourself to not give the wrong
|   impression that they're intrinsically related. Conversely, if
|   they _are_ intrinsically related (i.e. it's an invariant of
| the system that the table name starts with the address type as
| a prefix) then it's better for the code to align with that.
| semiinfinitely wrote:
| why two 'p's - grep only has one
| wging wrote:
| English inserts an additional 'p' in some cases; for precedent
| consider "stoppable", "unflappable", "skippable".
|
| See https://english.stackexchange.com/questions/30001/why-is-
| shi...
| Hackbraten wrote:
| See also: https://en.wikipedia.org/wiki/Digraph_(orthography)
| #Double_l...
| mckn1ght wrote:
| That's usually how it's done with words that end in one
| consonant when adding a suffix that starts with a vowel, so as
| not to change the pronunciation of the short vowel in the root
| word due to english's rules around long and short vowels. See
| also map->mapped, bat->batted, tap->tappable etc
| loeg wrote:
| Yeah, this is also a benefit of e.g. C identifiers vs C++, where
| namespace, class, and method/variable can all be listed in
| separate places, breaking the ability to locate non-unique
| method/variable names with grep.
| x3n0ph3n3 wrote:
| > No AI tooling was used in the creation of this article.
|
| That was refreshing.
| jimmaswell wrote:
| It just doesn't have that genuine artisan smell to it when
| someone uses a printing press a computer with automatic
| typesetting spellcheck Wikipedia AI to help write their
| article.
| jongjong wrote:
| This is a great point. One of my pet peeves is seeing an error in
| the logs which I cannot find in the code for various reasons.
| Sometimes the error message is constructed in a complicated way
| with variables concatenated together or the error message is
| extremely generic and I get matches in 100 different places.
|
| I'm an advocate for the idea that any aspect of a system which
| communicates either with end users or with sysadmins should be
| given high exposure in the code base. Typically, this means
| constructing abstractions in such a way that higher-level
| business logic and log messages are easily traceable from a
| single file. I make it so that the business layer sits above all
| other layers, as close to the program's entry point as possible.
| rezaprima wrote:
| I had been bitten by ruby's metaprogramming on this.
| elijahbenizzy wrote:
| Heh, this was very much the design philosophy behind Hamilton
| (github.com/dagworks-inc/hamilton).
|
| The basic idea was that if you have a data artifact (columns for
| dataframes initially), you should be able to ctrl-f and find it
| in your codebase. 1:1 mapping of data -> function.
|
| People take a long time to figure out that the readability gains
| from greppability are worth whatever verbosity comes with it,
| largely because they think of code too much as a craft (make it
| as small/neat as possible) and not as documentation for a live
| process...
| lucumo wrote:
| Grepping for symbols like function names and class names feels so
| anemic compared to using a tool that has a syntactic
| understanding of the code. Just "go to definition" and "find
| usages" alone reduce the need for text search enormously.
|
| For the past decade-plus I have mostly only searched for user
| facing strings. Those have the advantage of being longer, so are
| more easily searched.
|
| Honestly, posts like this sound like the author needs to invest
| some time in learning about better tools for his language. A good
| IDE alone will save you so much time.
| jakub_g wrote:
| Your observation does not help with the majority of the points
| in the article. How do you find all usages of a _parameter
| value literal_?
| troupo wrote:
| This is what the article starts with: "Even in projects
| exclusively written by myself, I have to search a lot:
| function names, error messages, class names, that kind of
| thing."
|
| All of that is trivial to search for with a tool that
| understands the language.
| renewiltord wrote:
| I actually don't think there's a tool that handles usages
| when using PHP varvars or when using example number one
| there which is parametrically choosing a table name.
|
| When you string interpolate to build the name you lose
| searchability.
| troupo wrote:
| Yes, full-text search is a great fallback when everything
| else fails. But in the use cases listed at the beginning
| of the article it's usually not needed if you have proper
| tools
| nosianu wrote:
| > _All of that is trivial to search for with a tool that
| understands the language._
|
| Isn't string search, or grepping for patterns, even more
| trivial? So what is your argument? You found an alternative
| method, good, but how is it any better?
|
| In my own case, I wrote a library that we used in many
| projects, and I often wanted to know where and how
| functions from my lib were used in those projects. For
| example, to be able to tell how much of an effort it would
| be for the users to refactor when I changed something.
| However, your method of choice at least with my IDE
| (Webstorm) only worked locally within the project. Only
| string search would let me reliably and easily search all
| projects.
|
|       I actually experimented with creating a "meta" project of all
|       projects, but while it worked, that led to too many
| problems, and the main method to find anything still was
| string search (CTRL-SHIFT-F Find dialog in IDEA IDEs is
| string search and it's a wonderful dialog in that IDE
| family). I also had to open that meta project. Instead, I
| created a gitignored folder with symlinks to the sources of
| all the other projects and created a search scope for that
| folder, in which the search dialog let me string-search all
| projects' sources at once right from within the library
| project and still being able to use the excellent Find
| dialog.
|
| In addition, I found that sometimes the IDE would not find
| a usage even within the project. I only noticed because I
| used both methods, and string search showed me one or two
| places more than the method that relied on the underlying
| code-parsing. Unfortunately IDEs have bugs, and the method
|       you suggest relies on much more work of the IDE in parsing
| and indexing compared to the much more mundane string or
| string pattern search.
| troupo wrote:
| > Isn't string search, or grepping for patterns, even
| more trivial?
|
|         It's not trivial when you're looking for _symbols_ in
| _context_.
|
| > the method you suggests relies on much more work of the
| IDE in parsing and indexing compared to
|
| ...compared to parsing and indexing you have to do
| manually because a full-text search (especially in a
| large codebase) will return a lot of irrelevant info?
|
| Funnily enough I also have a personal anecdote. We had a
| huge PHP code base based on Symfony. We were in the
| middle of a huge refactoring spree. I saw my colleagues
| switch from vim/emacs to Idea/WebStorm looking at how I
| easily found symbols in the code base, found their
| usages, refactored them etc. compared to the full-text
| search they were always stuck with.
|
| This was 5-6 years ago, before LSP became ubiquitous.
|           nosianu wrote:
|           > _It's not trivial_
|
|           Did you miss the comparison? The "_more_ trivial"? The
|           context of my response? Please read the parent comment I
|           responded to; treating my comment as standalone and
|           adding some new meaning makes no sense.
|
|           String search _is_ more trivial than a search that
|           involves an interpretation of the code structure and
|           meaning. I have no idea why you wish to start a
|           discussion about such a trivial statement.
|
|           > _because a full-text search (especially in a large
|           codebase) will return a lot of irrelevant info?_
|
| It doesn't do that for me but instead works very well. I
| don't know what you do with your symbol names, but I have
| barely any generic function names, the vast majority of
| them are pretty unique.
|
| No idea how you use search, but I'm never looking for
| "doSomething(", it's always "doSomethingVerySpecific()",
| or some equally specific string constant.
|
| I don't have the problems you tell me I should have, and
| _my_ use case was the subject of my comment, as should be
| clear, as well as my comment being a _response_ to a
| specific point made by the parent comment.
| cma wrote:
| > All of that is trivial to search for with a tool that
| understands the language.
|
| Some literal in a log message may come from the code or it
| might be remapped in some config file outside the language
| the LSP is looking at, or an environment variable etc.. I
| just go back and forth with grep and IDE tools, both have
| different tradeoffs.
| troupo wrote:
| The thing is, so many people are weirdly obsessed with
| never using any other tools besides full-text search. As
| if using useful tools somehow makes them a lesser
| programmer or something :)
| CrimsonRain wrote:
| By not using literals everywhere. All literals are defined
| somewhere (start of function, class etc) as enums or vars and
| used.
|
| Just because I have 20 usage of 'shipping_address' doesn't
| mean I'll have this string 20 times in different places.
|
|     Grep has its place and I often need to grep codebases which
|     have been written without much thought towards DX. But
|     writing it nicely allows LSP to take over.
| brain5ide wrote:
| I think the first sentence of the author counters your comment.
| What you described works best in a familiar codebase where the
| organizing principles have been maintained well and are
| familiar to the reader and the tools are just the extension of
| those organizing principles. Even then a deviation from those
| rules might produce gaps in understanding of what the codebase
| does.
|
| And grep cuts right through that in a pretty universal way.
| What the post describes are just ways to not work against grep
| to optimize for something ephemeral.
| ricardo81 wrote:
| Agree. Not just because it's unfamiliar code, you can also
| get a feel for how the program/programmer(s) structured the
| whole thing.
| zarzavat wrote:
| Go to definition and find usages only work one symbol at a
| time. I use both, but I still use global find/replace for
| groups of symbols sharing the same concept.
|
| For example if I want to rename all "Dog" (DogModel, DogView,
| DogController) symbols to "Wolf", find/replace is much better
| at that because it will tell me about symbols I had forgotten
| about.
| gugagore wrote:
| I am familiar with the situation you describe, and it's a
| good point.
|
| However, it does suggest that there is an opportunity for
| factoring "Dog" out in the code, at least by name spacing
| (e.g. Dog.Model).
| f1shy wrote:
| That really depends on the context, and specific situation.
| zarzavat wrote:
| That gets to the core of the issue doesn't it? There are
| two cultures: Do you prefer to refactor DogView into
| Dog.View, or do you prefer to refactor Dog.View into
| DogView.
|
| Personally I value uniqueness/canonicalness over
| conciseness. I would rather have DogView because then there
| is one name for the symbol regardless of where I am in the
| codebase. If the same symbol is used with differently
| qualified names it is confusing - I want the least
| qualified name to be more descriptive than "View".
|
| The other culture is to lean heavily on namespaces and to
| not worry about uniqueness. In this case you have View and
| Dog.View that may be used interchangeably in different
| files. This is the dominant culture in Java and C#.
| kccqzy wrote:
| The second culture that you describe happens also to be
| how OCaml structures things in modules. It's quite a
| turnoff for me.
| f1shy wrote:
| For that use case I think you can use treesitter[1] you can
| find Dog.* but only if it is a variable name, for example.
| Avoiding replacement inside of say literals.
|
| [1] https://www.youtube.com/watch?v=MZPR_SC9LzE
| turboponyy wrote:
| There's no reason they _have_ to work one symbol at a time -
| that's just a missing feature in your language server
| implementation.
|
| Some language servers support modifying the symbols in
| contexts like docstrings as well.
| setopt wrote:
| I've never seen an LSP server that lets you rename "Dog" to
| "Wolf" where your actual class names are "Dog[A-Za-z]*"?
|
| Do you have an example?
| Maxion wrote:
| IntelliJ's refactor tool?
| yen223 wrote:
| IntelliJ doesn't use LSP as far as I know.
|
| It does usually make that kind of DogModel -> WolfModel
| refactoring.
| turboponyy wrote:
| Neither have I; and no, I don't - I misinterpreted what
| you said.
|
| But I don't see why LSP servers shouldn't support this,
| still. I'm not sure if the LSP specification allows for
| this as of current, though.
| setopt wrote:
| I would actually love a regexp search-and-replace
| assisted by either TreeSitter or LSP.
|
| Something that lets me say that I want to replace
| "Dog\\(.*\\)" with "Wolf\1", but where each substitution
| is performed only within single "symbols" as identified
| by TS or LSP.
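|
| A minimal sketch of that idea, assuming Python source and using the
| standard-library tokenize module as a stand-in for TreeSitter or
| LSP. The helper name rename_symbols and the SAMPLE text are
| hypothetical; the substitution is applied only to whole identifier
| tokens, so string literals and comments are left alone.

```python
import io
import re
import tokenize

# Hypothetical input: "Dog..." appears as identifiers, in a string, and in a comment.
SAMPLE = '''class DogModel:
    label = "DogModel stays inside this string"

def show(dog: DogModel) -> str:
    return DogView(dog).render()  # DogView in a comment stays too
'''


def rename_symbols(source: str, pattern: str, replacement: str) -> str:
    """Apply a regex replacement to identifiers that the pattern fully matches."""
    regex = re.compile(pattern)

    # Absolute offset of the start of each line, to convert (row, col) positions.
    line_starts = [0]
    for line in source.splitlines(keepends=True):
        line_starts.append(line_starts[-1] + len(line))

    edits = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type != tokenize.NAME:
            continue  # skip strings, comments, operators, whitespace tokens
        match = regex.fullmatch(tok.string)
        if match:
            start = line_starts[tok.start[0] - 1] + tok.start[1]
            end = line_starts[tok.end[0] - 1] + tok.end[1]
            edits.append((start, end, match.expand(replacement)))

    # Splice from the end so earlier offsets stay valid.
    for start, end, new_text in reversed(edits):
        source = source[:start] + new_text + source[end:]
    return source


if __name__ == "__main__":
    print(rename_symbols(SAMPLE, r"Dog(.*)", r"Wolf\1"))
```

| Calling rename_symbols(SAMPLE, r"Dog(.*)", r"Wolf\1") renames the
| DogModel and DogView identifiers but not the text inside the string
| or the comment. A tokenizer only knows that something is a name, not
| which declaration it refers to, so this sits between plain regex
| replace and a real semantic rename.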
| nickcox wrote:
| ast-grep supports regex [1] or pattern based [2] matching
| and replacement.
|
| [1] https://ast-grep.github.io/guide/rule-config/atomic-rule.htm...
| [2] https://ast-grep.github.io/guide/rule-config/atomic-rule.htm...
| sandermvanvliet wrote:
| Jetbrains ReSharper (and Rider) is smart enough to handle
| these things. It'll suggest renames across other symbols even
| ones that have related names
| heisenbit wrote:
| A good IDE can be so much better iff it understands the code.
| However this requires the IDE to be able to understand the
| project structure, dependencies etc. which can be considerable
| effort. In a codebase with many projects employing several
| different languages it becomes hard to reach and maintain the
| "IDE understands everything" state.
| amichal wrote:
| And an IDE would also fail to find references for most of the
| cases described in the article: name
| composition/manipulation, naming consistency across language
| barriers, and flat namespaces in serialization. And file/path
| folder naming seems to be irrelevant to the smart IDE
| argument. "Naming things is hard"
| carlmr wrote:
| And especially in large monorepos anything that understands
| the code can become quite sluggish. While ripgrep remains
| fast.
|
| A kind of in-between I've found for some search and replace
| action is comby (https://comby.dev/). Having a matching
| braces feature is a godsend for doing some kind of
| replacements properly.
| underdeserver wrote:
| Unfortunately in larger codebases or dynamic languages these
| tools are just not good enough today. At least not those I and
| my employers have tried.
|
| They're either incomplete (you don't get ALL references or you
| get false references) or way too slow (>10 seconds when rg
| takes 1-2).
|
| Recommendations are most welcome.
| jimmaswell wrote:
| Only thing I can recommend is using C# (obviously not always
| possible). Never had an issue with these functions in Visual
| Studio proper no matter how big the project.
| aa-jv wrote:
| On the flipside, IDEs can turn you into lazy, inefficient
| programmers by doing all the hand-holding for you.
|
| If your feelings are anemic when tasked with doing a grep, it's
| because you have lost a very valuable skill by delegating it to
| a computer. There are some things the IDE is never going to be
| able to find - lest it becomes the development environment - so
| keeping your grep fu sharpened is wise beyond the decades.
|
| (Disclaimer: 40 years of software development, and
| vim+cscope+grep/silversearcher are all I really need, next to
| my compiler..)
| HdS84 wrote:
| Huh? I have an old hand-powered drill from my Grandpa in my
| workshop. I used it once for fun. For all other tasks I use a
| powered drill. Same for IDEs. They help you refactor and
| reason about code - both properties I value. Sure, I could
| print it and use a textmarker, but I'm not Grandpa
| trashtester wrote:
| Knowing the bash ecosystem translates better to how you use
| the knife in the kitchen.
|
| Sure you can replace most uses of a knife with power tools,
| but there is a reason why most top chefs still rely on that
| knife for most of those tasks.
|
| A hand powered drill is more like a hand powered
| meatgrinder. It has the same limitation as the powered
| versions, and is simply a more primitive version.
| winwang wrote:
| I count the IDE and stuff like LSP as natural extensions of
| the compiler. For sure I grep (or equivalent) for stuff, but
| I highly prefer statically typed languages/ecosystems.
|
| At the end of the day, I'm here to solve problems, and
| there's no end to them -- might as well get a head start.
| high_na_euv wrote:
| Leveraging technology is a good thing
| throwaway2037 wrote:
| > lazy... programmers
|
| Since when was that a bad thing? Since time immemorial, it
| has been hailed as a universal good for programmers to be
| lazy. I'm pretty sure Larry Wall has lots of jokes about this
| on Usenet.
|
| Also, I can clearly remember switching from vim/emacs to
| Microsoft Visual Studio (please, don't throw your tomatoes
| just yet!). I was blown away by IntelliSense. Suddenly, I was
| focusing more on writing business logic, and less time
| searching for APIs.
| trashtester wrote:
| This is the wrong type of lazy.
|
| Command line tools like grep are force multipliers for
| programmers. GUIs come with the risk of not being able to
| learn how to leverage this power. In the end, that often
| leads to more manual work.
|
| And today, bash is a lingua franca that you can bring with
| you almost everywhere. Even Windows "speaks" bash these
| days, with WSL.
|
| In itself, there's nothing wrong with using the built-in
| features of a GUI. Right-clicking a method (or using a
| keyboard shortcut) to find the definition in a given code
| base IS nice for that particular operation.
|
| But by knowing grep/awk/find/git command line and so on,
| combined with bash scripting and advanced regular
| expressions, you open up a new world of possibilities.
|
| All those things CAN be done using Python/C#/Java or
| whatever your language is. But a 1-liner in bash can be
| 10-100 lines of C#.
| lucumo wrote:
| Where does this stupid notion come from that using
| powerful tools means you can't handle the less powerful
| ones anymore? Did your skills with a hand screwdriver
| atrophy when you learned how to use a powered
| screwdriver? Come on.
|
| I use grep multiple times a day. I write bash scripts
| quite often. I'm not speaking from a position of
| ignorance of these tools. They have their place as a
| lowest common denominator of programming tools. But
| settling for the lowest common denominator is not a path
| to productivity.
|
| Doesn't mean you should forget your skills, but it does
| mean you should investigate better tools. And leverage
| them. A lot.
|
| > But a 1-liner in bash can be 10-100 lines of C#.
|
| Yes. And the reverse is also true. bash is fast and easy
| if there's an existing tool you can leverage, and slow
| and hard when there's not.
| lucumo wrote:
| > If your feelings are anemic
|
| I'm not _feeling_ anemic. The tool is anemic, as in,
| underpowered. It returns crap you don't want, and doesn't
| return stuff you do want.
|
| My grep-fu is fine. It's a perfectly good tool if you have
| nothing better. But usually you do have something better.
|
| Using the wrong tool to make yourself feel cool is stupid.
| Using the wrong tool because a good tool could make you lazy
| shows a lack of respect for the end result.
| IshKebab wrote:
| Definitely true when you can use static typing.
|
| Unfortunately sometimes you can't, and sometimes you can but
| people can't be arsed, so this is still a consideration.
| a_e_k wrote:
| I've come to really like language servers for big personal and
| work projects where I already have my tools configured and
| tuned for efficiently working with it.
|
| But being able to grep is really nice when trying to figure
| something out about a source tree that I don't yet have set up
| to compile, nor am I a developer of. I.e., I've downloaded the
| source for a tool I've been using pre-built binaries of and am
| now trying to trace why I might be getting a particular error.
| leni536 wrote:
| I can't use an IDE on my entire git history, but git can grep.
| citrin_ru wrote:
| Not everything you need to look for is a language identifier. I
| often grep for configuration option names in the code to see
| what the option actually does - sometimes it is easy to grep,
| sometimes there are too many matches, sometimes they cannot be
| found because the option name is composed in the code from separate
| parts that are too common to grep for individually. It's not hard
| to make config options greppable but some coders just don't
| care about this property.
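|
| A small illustration of the difference, as a sketch only: the
| option names, get_option, and the call sites below are hypothetical.
| The first accessor builds the option name from parts, so a search
| for the full name misses its call sites; the second spells it out.

```python
CONFIG = {"cache_ttl_seconds": 300, "cache_max_entries": 1000}


def get_option(section: str, field: str) -> int:
    # Hard to grep: callers never spell out the full option name.
    return CONFIG[f"{section}_{field}"]


def cache_ttl_seconds() -> int:
    # Greppable: the full option name appears verbatim.
    return CONFIG["cache_ttl_seconds"]


# Two hypothetical call sites, as they would appear in source files.
CALL_SITES = [
    'ttl = get_option("cache", "ttl_seconds")',
    'ttl = CONFIG["cache_ttl_seconds"]',
]


def grep(needle: str, lines: list[str]) -> list[str]:
    """Plain substring search, standing in for grep over the source tree."""
    return [line for line in lines if needle in line]


if __name__ == "__main__":
    # Only the second call site is found when searching for the option name.
    print(grep("cache_ttl_seconds", CALL_SITES))
```

| Both accessors return the same value; only the second one can be
| found by searching for the name that appears in the config file.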
| k__ wrote:
| Honestly, in my 18 years of software development, I haven't
| "greped" code once.
|
| I only use grep to filter the output of CLI tools.
|
| For code, I use my IDE or repository features.
| yCombLinks wrote:
| Do you use the find feature in your IDE? I.e., not find by
| reference, just text matching? That's the same as
| greppability.
| laserbeam wrote:
| Scenarios where an IDE with full syntactic understanding is
| better:
|
| - It's your day to day project and you expect to be working in
| it for a long time.
|
| Scenarios where grepping is more useful:
|
| - Your language has #ifdef or equivalent syntax which does
| conditional compilation making syntactic tools incomplete.
|
| - You just opened the project for the first time.
|
| - It's in a language you don't daily drive (you write backend
| but have to delve in frontend code, it's a 3rd party library,
| it's configuration files, random json/xml files or data)
|
| - You're editing or searching through documentation.
|
| - You haven't even downloaded the project and are checking
| things out in github (or some similar site for your project).
|
| - You're providing remote assistance to someone and you are not
| at your main development machine.
|
| - You're remoting via SSH and have access to code there (say
| it's a python server).
|
| Yes, an IDE will save you time daily driving. But there's no
| reason to sabotage all the other usecases.
| popinman322 wrote:
| Grep is also useful when IDE indexing isn't feasible for the
| entire project. At past employers I worked in monorepos where
| the sheer size of the index caused multiple seconds of delay
| in intellisense and UI stuttering; our devex team's preferred
| approach was to better integrate our IDE experience with the
| build system such that only symbols in scope of the module
| you were working on would be loaded. This was usually fine,
| and it works especially well for product teams, but it's a
| headache when you're doing cross-cutting work (e.g. for
| infrastructure projects/overhauls).
|
| We also had a livegrep instance that we could use to grep any
| corporate repo, regardless of where it was hosted. That was
| extremely useful for investigating failures in build scripts
| that spanned multiple repositories (e.g. building a Go
| sidecar that relies on a service config in the Java
| monorepo).
| cma wrote:
| If running into this, make sure to enable 64-bit
| intellisense and increase the ram limit, by default it is
| 4gb.
| Cieric wrote:
| As someone who runs into that daily, I'm surprised I
| never heard of this before.
|
| I seem to have found the 64-bit mode under "Tools >
| Options" then "Text Editor > C/C++ > IntelliSense". The
| top option is [] Enable 64-bit IntelliSense.
|
| But I can't seem to find the ram limit you mentioned and
| searching for it just keeps bringing up stuff related to
| vscode. Do you know where it is off the top of your head
| or a page that might describe it?
| samatman wrote:
| The RAM limit _is_ 32 bit Intellisense. 2^32 is 4GiB.
|
| Edit: I take that back, this was a first-principles
| comment. There's a setting 'C_Cpp: Intelli Sense Memory
| Limit' (space included).
| Cieric wrote:
| Thanks for that. Searching Google for that result
| only led me to vscode's IntelliSense settings. Searching
| for "Intelli Sense Memory Limit" setting in visual studio
| didn't lead me right to the result but it did give me a
| whole settings page that "matched". I found the setting
| in visual studio is "IntelliSense Process Memory Limit"
| which is under "Text Editor > C/C++ > Advanced" then
| under header "IntelliSense" towards the bottom of the
| section.
| emn13 wrote:
| Further important (to me) scenarios that also argue for
| greppability:
|
| - greppability does not preclude IDE or language server
| tooling; there's often special cases where only certain e.g.
| context-dependent usages matter, and sometimes grep is the
| easiest way to find those.
|
| - projects that include multiple languages, such as for
| instance the fairly common setup of HTML, JS, CSS, SQL, and
| some server-side language.
|
| - performance in scenarios with huge amounts of code, or
| where you're searching very often (e.g. in each git commit
| for some amount of history)
|
| - ease of use across repositories (e.g. a client app, a spec,
| and a server app in separate repos).
|
| I treat greppability as an almost universal default. I'd much
| rather have code in a "weird" naming style in some language
| but have consistent identifiers across languages, than have
| normal-style-guide default identifiers in each language, but
| differing identifiers across languages. If code "looks
| weird", if anything that's often actually a _benefit_ in such
| cases, not a downside - most serialization libraries I use
| for this kind of stuff tend to do a lot of automagic mapping
| that can break in ways that are sometimes hard to detect at
| compile time if somebody renames something, or sometimes even
| just for a casing change or type change. Having a hint as to
| this fragility immediately at a glance even in dynamically
| typed languages is sometimes a nice side-effect. Very
| speculatively, I wouldn't be surprised if AI coding tools can
| deal with consistent names better than context-dependent ones
| too; greppability is likely not specifically about merely the
| tool grep.
|
| And the best part is that there's almost no downside; it's
| not like you need to pick either a language server, IDE or
| grep - just use whatever is most convenient for each task.
| cxr wrote:
| - You're fully aware that it would be better to be able to
| use tooling for $THING, but tooling doesn't exist yet or is
| immature.
| kragen wrote:
| you would not believe the amount of time i spent pretty-
| printing python dicts by hand last week
| yen223 wrote:
| https://docs.python.org/3/library/pprint.html
| kragen wrote:
| yeah, pprint is why i was doing it by hand ;)
| lkbm wrote:
| I used to pipe things through black for that. (a script
| that imported black, not just black on the command line.)
|
| I also had `j2p` and `p2j` that would convert between
| python (formatted via black) and json (formatted via jq),
| and the `j2p_clip`/`p2j_clip` versions that would pipe
| from clipboard and back into clipboards.
|
| It's worth taking the time to build a few simple scripts
| for things you do a lot. I used to open up the repl and
| import json to convert between json and python dicts
| multiple times a day, so spending a few minutes throwing
| together a simple script to do it was well worth the
| effort.
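|
| A rough sketch of what such a pair of helpers could look like. This
| is not the original scripts: it swaps in pprint and json.dumps from
| the standard library where the comment above used black and jq, and
| the function bodies are assumptions.

```python
import ast
import json
import pprint
import sys


def j2p(json_text: str) -> str:
    """Convert JSON text to Python literal text."""
    return pprint.pformat(json.loads(json_text), sort_dicts=False)


def p2j(python_text: str) -> str:
    """Convert Python literal text (dicts, lists, strings, numbers) to JSON text."""
    # ast.literal_eval only evaluates literals, so it is safe on pasted text.
    return json.dumps(ast.literal_eval(python_text), indent=2)


if __name__ == "__main__":
    # Usage: python convert.py j2p < input.json   (or p2j for the reverse)
    convert = j2p if sys.argv[1:] == ["j2p"] else p2j
    print(convert(sys.stdin.read()))
```

| Tuples come out as JSON arrays, and sets are rejected by json.dumps,
| so this only round-trips data that JSON can represent.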
| kragen wrote:
| part of what i ended up with was this:
| {'country': ['25', '32', '6', '37', '72', '22', '17',
| '39', '14', '10', '35', '43', '56',
| '36', '110', '11', '26', '12', '4', '5'],
| 'timeZone': '8', 'dateFrom': '2024-05-01', 'dateTo':
| '2024-05-30',
|
| black is the opposite extreme from what i wanted;
| https://black.readthedocs.io/en/stable/the_black_code_style/...
| explains:
|
| > _If a data structure literal (tuple, list, set, dict)
| or a line of "from" imports cannot fit in the allotted
| length, it's always split into one element per line._
|
| i'm not interested in minimizing diffs. i'm interested in
| being able to see all the fields of one record on one
| screen--moreover, i'd like to be able to see more than
| one record at a time so i can compare what's the same and
| what's different
|
| black seems to be designed for the kind of person who
| always eats at mcdonald's when they travel because they
| value predictability over quality
| pushfoo wrote:
| My understanding of black is that it solves bikeshedding
| by making everyone a little unhappy.
|
| For aligned column readability and other scenarios, _#
| fmt: off_ and _# fmt: on_ become crucial. The problem is
| that like _# type: ignore_ , those start spreading if
| you're not careful.
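|
| For illustration, a minimal sketch of the aligned-column case. The
| table name and its values are made up; the two marker comments are
| Black's documented way to exclude a region from formatting.

```python
# fmt: off
# Hand-aligned columns: Black leaves everything between the markers untouched.
RATES = [
    ("USD",   1.00, "dollar"),
    ("EUR",   0.90, "euro"),
    ("JPY", 144.50, "yen"),
]
# fmt: on

# Outside the markers, Black is free to normalize spacing and line breaks.
CODES = [code for code, _rate, _name in RATES]
```

| Keeping the markers tight around the table, rather than around a
| whole module, is one way to stop them from spreading.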
| kragen wrote:
| yeah; unless your coworkers are hindu, you can solve
| 'bikeshedding' about which restaurant to go to by going
| to mcdonald's, too
| sgarland wrote:
| My only complaint with black is that it only splits long
| definitions into per-line if they exceed a limit. That's
| probably configurable, now that I write it down.
|
| Other than that, I actually quite like its formatting
| choices.
| pushfoo wrote:
| Line length is definitely configurable. All it takes is
| adding the following to pyproject.toml[1]:
|     [tool.black]
|     line-length = 100
|
| Aside from matrix-like or column aligned data, the only
| truly awful thing I've encountered has been broken
| f-string handling[2].
|
| [1]: Example from
| https://github.com/pythonarcade/arcade/blob/808e1dafcf1da30f...
|
| [2]: https://github.com/psf/black/issues/4389
| lkbm wrote:
| Fair. I spent some time trying to figure out how to make
| it do roughly that before giving up.
| kragen wrote:
| i kind of get the vibe from the black documentation that
| it's written by the kind of person who thinks we're bad
| people for wanting that, and perhaps that everyone should
| wear the same uniform because vanity is sinful and
| aesthetics are frivolous
| pushfoo wrote:
| we keep having similar problems, lol.
| jollyllama wrote:
| >It's your day to day project and you expect to be working in
| it for a long time.
|
| Bold of everyone here to assume that everyone has a day to
| day project. If you're a consultant or for other reasons
| you're switching projects on a month to month basis,
| greppability is probably the top metric second to UT
| coverage.
| switchbak wrote:
| They said the scenario in which that would be useful was
| IF: "It's your day to day project and you expect to be
| working in it for a long time". The implication being that
| if neither of those hold then skip to the next section.
|
| I don't think anyone is assuming anything here. I've
| contracted for most of my career and this didn't seem like
| an outlandish statement.
|
| Also, if you're working in a project for a month, odds are
| you could set up an IDE in the first few hours. Not sure
| how any of this rises to the level of being "bold".
| lolinder wrote:
| > It's your day to day project and you expect to be working
| in it for a long time.
|
| I don't think we need to restrict the benefits quite that
| much--if it's a project that _isn't_ my day-to-day but is in
| a language I already have set up in my IDE, I'd much prefer
| to open it up in my IDE and use jump to definition and
| friends than to try to grep and hope that the developers made
| it grepable.
|
| Going further, I'd equally rather have plugins ready to go
| for every language my company works in and use them for
| exploring a foreign codebase. The navigation tools all work
| more or less the same, so it's not like I need to invest
| effort learning a new tool in order to benefit from
| navigation.
|
| > Yes, an IDE will save you time daily driving. But there's
| no reason to sabotage all the other usecases.
|
| Certainly don't sabotage, but some of these suggestions are
| bad for other reasons that aren't about grep.
|
| For example: breaking the naming conventions of your language
| in order to avoid remapping is questionable at best.
| Operating like that binds your business logic way too tightly
| to the database representation, and while "just return the db
| object" sounds like a good optimization in theory, I've never
| not regretted having frontend code that assumes it's
| operating directly on database objects.
| kelnos wrote:
| > _if it's a project that_ isn't _my day-to-day but is in
| a language I already have set up in my IDE, I'd much
| prefer to open it up in my IDE and use jump to definition
| and friends than to try to grep and hope that the
| developers made it grepable._
|
| It's funny, because my preference and actual use is the
| exact opposite: for a project that isn't my day-to-day, I'm
| _much_ more likely to try to grep through it rather than
| open it in an IDE.
| makeitdouble wrote:
| > if it's a project that isn't my day-to-day
|
| Another overlooked advantage of greppability is to be able
| to fuzzy the search, or discover related code that wasn't
| directly linked to what you were looking for.
|
| For instance if you were hunting for the method updating a
| `foo_bar` instance, grepping it will also give you
| instances of `generic_foo_bar` and `shim_foo_bar`. It can
| be noise, as it can be stuff you wouldn't have seen
| otherwise and save your bacon. If you're not familiar with
| a project I think it's quite an advantage.
|
| > hope that the developers made it grepable
|
| hopefully it's enforced at an organization level.
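|
| The `foo_bar` example can be sketched as follows, assuming a few
| made-up source lines. A plain substring search surfaces the related
| names; a whole-identifier search (what grep's -w flag does for
| words) narrows it back down when the extra hits are noise.

```python
import re

# Hypothetical lines from a codebase.
LINES = [
    "def update_foo_bar(foo_bar): ...",
    "def update_generic_foo_bar(generic_foo_bar): ...",
    "shim_foo_bar = ShimFooBar()",
    "unrelated = 1",
]


def grep(pattern: str, lines: list[str]) -> list[str]:
    """Return the lines in which the regex pattern occurs anywhere."""
    regex = re.compile(pattern)
    return [line for line in lines if regex.search(line)]


# Fuzzy: any occurrence, including generic_foo_bar and shim_foo_bar.
fuzzy = grep(r"foo_bar", LINES)

# Exact: foo_bar must not be part of a longer identifier.
exact = grep(r"(?<![A-Za-z0-9_])foo_bar(?![A-Za-z0-9_])", LINES)

if __name__ == "__main__":
    print(len(fuzzy), "fuzzy hits;", len(exact), "exact hit")
```

| Neither mode needs an index or a working build, which is part of why
| it suits an unfamiliar project.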
| gpderetta wrote:
| - you just switched branch/rebased and the index is not up to
| date.
|
| - the project is large enough that the IDE can't cope.
|
| - you want to also match comments, commented out code or in-
| project documentation
|
| - you want fuzzy search and match similarly named functions
|
| I use clangd integration in my IDE all the time, but often
| brute force is the right solution.
| joe-six-pack wrote:
| You forgot massive codebases. Language servers really
| struggle with anything on the order of the Linux kernel,
| FreeBSD, or Chromium.
| umanwizard wrote:
| clangd works fine for me with the linux kernel. For best
| results build the kernel with clang by setting LLVM=1 and
| KERNEL_LLVM=1 in the build environment and run
| ./scripts/clang-tools/gen_compile_commands.py after
| building.
| toprerules wrote:
| Ok, but now every time you switch commits you have to
| wait for clangd to reindex. Grepping in the kernel is
| just as fast and you can do it without running a 4GB+
| process that takes 10+ minutes to index
| umanwizard wrote:
| Sure. I wasn't intending to claim that there is no reason
| to care about greppability. Just providing some tips
| about getting clangd to work with linux for those who
| might find that useful.
| Groxx wrote:
| I honestly suspect that the amount of time spent dealing
| with the issues monorepos cause is net-larger than the
| gains most get from what a monorepo offers. It's just
| harder to measure because it tends to degrade slowly,
| happen to things you didn't realize you were relying on
| (until you need them), and without clear ways to point
| fingers at the cause.
|
| Plus it means your engs don't learn how to deal with open
| source code concerns, e.g. libraries, forking, dependency
| management. Which gradually screws over the whole
| ecosystem.
|
| If you're willing to put Google-scale effort into building
| your tooling, sure. Every problem is solvable. Only Google
| does that though, everyone else is getting by with a tiny
| fraction of the resources _and_ doesn't already have a
| solid foundation to reduce those maintenance costs.
| ants_everywhere wrote:
| The projects mentioned were all single projects with
| single repos
| Groxx wrote:
| Sure. But those are far from the only massive codebases
| out there, and many of the biggest are monorepos because
| sorta by definition they are the size of _multiple_
| projects.
| umanwizard wrote:
| > Your language has #ifdef or equivalent syntax which does
| conditional compilation making syntactic tools incomplete.
|
| Your other points make sense, but in this case, at least for
| C/C++, you can generate a compile_commands.json that will let
| clangd interpret your code accurately.
|
| If building with make just do `bear -- make` instead of
| `make`. If building with cmake pass
| `-DCMAKE_EXPORT_COMPILE_COMMANDS=1`.
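|
| To make the connection to #ifdef concrete: compile_commands.json
| records the compiler invocation per source file, including the -D
| flags that decide which conditional branches are active. A small
| sketch that reads those flags, assuming the standard "command" /
| "arguments" / "file" fields; the sample entries and paths are made
| up.

```python
import json
import shlex

# Hypothetical database contents; a real one is generated by bear or cmake.
SAMPLE_DB = json.dumps([
    {
        "directory": "/work/build",
        "command": "cc -DENABLE_TLS=1 -DNDEBUG -Iinclude -c ../src/net.c -o net.o",
        "file": "../src/net.c",
    },
    {
        "directory": "/work/build",
        "arguments": ["cc", "-D", "LEGACY_IO", "-c", "../src/io.c"],
        "file": "../src/io.c",
    },
])


def defined_macros(db_text: str) -> dict[str, list[str]]:
    """Map each source file to the macros defined on its command line."""
    result = {}
    for entry in json.loads(db_text):
        # An entry carries either an "arguments" list or a shell "command" string.
        args = entry.get("arguments") or shlex.split(entry["command"])
        macros = []
        arg_iter = iter(args)
        for arg in arg_iter:
            if arg == "-D":
                macros.append(next(arg_iter, ""))  # form: -D NAME
            elif arg.startswith("-D"):
                macros.append(arg[2:])  # form: -DNAME or -DNAME=value
        result[entry["file"]] = macros
    return result


if __name__ == "__main__":
    for path, macros in defined_macros(SAMPLE_DB).items():
        print(path, macros)
```

| This covers one build configuration only; code guarded by macros
| that this particular build did not define is still easier to find
| with grep.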
| camel-cdr wrote:
| Does it evaluate macros? Because macros allow for arbitrary
| computation.
| umanwizard wrote:
| The macros I see in the real world seem to usually work
| fine. I'm sure it's not perfect and you can construct a
| macro that would confuse it, but it's a lot better than
| not having a compilation db at all.
| codedokode wrote:
| "Go to definition" often doesn't work in dynamic languages
| like Python without type hints; it might not work when the
| code is dynamically generated.
| neves wrote:
| It always works in VSCode if your environment is correctly
| configured.
| beeboobaa3 wrote:
| > - Your language has #ifdef or equivalent syntax which does
| conditional compilation making syntactic tools incomplete.
|
| You need a better IDE.
|
| > - You just opened the project for the first time.
|
| Go grab a coffee
|
| > - It's in a language you don't daily drive
|
| Jetbrains all products pack, baby.
|
| > - You haven't even downloaded the project and are checking
| things out in github (or some similar site for your project).
|
| On GitHub, press `.` to open it in a web-based vscode.
| Download it & open it in your IDE while you are doing this.
|
| > - You're remoting via SSH and have access to code there
| (say it's a python server).
|
| Don't do this. Check the git hash that was deployed and
| checkout the code locally.
| neves wrote:
| > - You're remoting via SSH and have access to code there
| (say it's a python server).
|
| VSCode SSH Extension for the win.
| jvanderbot wrote:
| > - Your language has #ifdef or equivalent syntax which does
| conditional compilation making syntactic tools incomplete.
|
| LSP-based tools are fine with this, generally. A syntactic
| understanding is an incomplete solution. I suspect GP meant
| LSP. (as long as compile_commands.json or equivalent is
| available).
|
| Many of those other caveats are non-issues once LSPs are
| widespread. Even Github has lsp-like go-to-def/go-to-ref,
| though it's not perfect.
| PhilipRoman wrote:
| IDEs are cool and all, but there is no way I'm gonna let VSCode
| index my 80GB yocto tmp directory. Ctags can crunch the whole
| thing in a few minutes, and so can grep.
|
| Plus there are cases where grep is really what you need, for
| example after updating a particular command line tool whose
| output changed, I was able to find all scripts which grepped
| the output of the tool in a way that was broken.
| phyrex wrote:
| This breaks down at scale and across languages. All the FAANGs
| make heavy use of the equivalent of grepping in their code base
| gregjor wrote:
| I abandoned VSCode and went back to vim + ctags + ripgrep after
| a year with the most popular IDE. I miss some features but it
| didn't give me a 10x or even 1.5x improvement in my own work
| along any dimension.
|
| I attribute that mostly to my several decades of experience
| with vi(m) and command line tools, not to anything inherently
| bad about VSCode.
|
| What counts as "better" tools has a lot of subjectivity and
| circumstances implied. No one set of tools works for everyone.
| I very often have to work over ssh on servers that don't allow
| installing anything, much less Node and npm for VSCode, so I
| invest my time in the tools that always work everywhere, for
| the work I do.
|
| The main project I've worked on for the last few years has a
| little less than 500,000 lines of code. VSCode's LSP takes a
| few seconds fairly often to maintain the LSP indexes. Running
| ctags over the same code takes about a second and I can control
| when that happens. vim has no delays at all, and ripgrep can
| search all of the files in a second or two.
| wrasee wrote:
| Did you consider Neovim? You get the benefit of vim while
| also being able to mix in as much LSP tooling as you like.
| The tradeoff is that it takes some time to set up, although
| that is getting easier.
|
| That won't make LSP go any faster though. There's still
| something interesting in the fact that a ripgrep of every
| line in the codebase can still be faster than a dedicated
| tool.
| gregjor wrote:
| Considered it and have tried repeatedly to get it to work
| with mixed success. As you wrote, it takes "some time" to
| set up. In my case it would only offer marginal
| improvements over plain vim, since I'm not that interested
| in the LSP integration (and vim has that too, through a
| plugin).
|
| In the environments I often work in I can't install
| anything or run processes like node. I ssh into a server
| and have to use whatever came with the Linux distro, which
| means sticking with the tools I will find everywhere. I
| can't copy the code from the server either. If I get lucky
| they used version control. I know not everyone works with
| those constraints. I specialize in working on abandoned and
| legacy code.
| wrasee wrote:
| Yes ok. And legacy code might be a good example where
| grep works well, if it's fair to argue a greater
| propensity for things like preprocessors, older languages
| and custom builds that may not play as well with
| semantic-level tools, let alone be written with modern
| tooling in mind.
| gregjor wrote:
| Lol, I'm not working with COBOL or Fortran. Legacy code
| in my world means the original developers have left, not
| that it dates from the 1970s. Mostly I work with PHP,
| shell scripts, various flavors of SQL, Python, sometimes
| Rails or other stuff. All things modern LSPs can handle.
| kragen wrote:
| can you not upload executables over ssh, say for policy
| reasons or disk-space reasons? how about shell scripts?
|
| i mean, i doubt i'm going to come up with some brilliant
| breakthrough that makes your life easier that you've
| somehow overlooked, but i'd like to understand what kinds
| of constraints people like you often confront
|
| i'm just glad you don't have to use teamviewer
| gregjor wrote:
| I don't have to use TeamViewer, though I very
| occasionally have to use Windows RDP.
|
| You can transfer any kind of file over ssh. scp, sftp,
| rsync will all copy binaries. Mainly the issues come down
| to policy and billable time. Many of my customers simply
| don't allow installing _anything_ on their servers
| without a tedious approval process. Even if I can install
| things I might spin my wheels trying to get it to work in
| an environment I don't have root privileges on, with no
| one willing to help, and I can't bill for that time. I
| don't work for free to get an editor installed. I use the
| tools I know I can find on any Linux/BSD server.
|
| With some customers I have root privileges and manage the
| server for them. With others their IT dept has rules I
| have to follow (I freelance) if I want to keep a good
| relationship. Since I juggle multiple customers and
| environments I find it simpler not having to manage
| different editors and environments, so I mostly stick
| with the defaults. I do have a .profile and .vimrc I copy
| around if allowed to, that's about it.
|
| I can't lose time/money and possibly goodwill whining
| about not having everything just-so for me. I recently
| worked on a server over ssh that didn't have tmux
| installed. Fortunately it did have screen, and I can use
| that too, no big deal. I spent less than 60 seconds
| figuring that out and getting to work rather than wasting
| hours of non-billable time annoying someone about how I
| needed tmux installed.
| kragen wrote:
| i see, thanks!
|
| wrt rdp, i feel like rdp is actually better than vnc or
| x11-over-ssh, but for cases where regular ssh works, i'd
| rather use ssh
|
| i wasn't thinking in terms of installing tmux, more like
| a self-contained binary that doesn't require any kind of
| 'installation'
| gregjor wrote:
| I used the word "install" but the usual rule says I can't
| install, upload, or execute any non-approved software.
| Usually that just gets stated as a policy, but I have
| seen Linux home directories on noexec partitions --
| government agencies and big corporations can get very
| strict about that. So copying a self-contained binary up
| and running it would violate the policy.
|
| I pretty much live in ssh. Remote Desktop means a lot of
| clicking and watching a GUI visibly repaint. Not
| efficient. Every so often I have customers using
| applications that only run on Windows, no API, no command
| line, so they will enable RDP to that, usually through a
| VPN.
| kragen wrote:
| i see! but i guess your .profile and .vimrc don't count?
| gregjor wrote:
| They aren't executables.
| kragen wrote:
| my cousin wrote a vt52 emulator in bash, and i was
| looking at a macro assembler written in bash the other
| day: https://github.com/jhswartz/mle-amd64/blob/master/amd64.
| i haven't seen a cscope written
| in bash, but you probably remember how the first versions
| of ctags were written in sh (or csh?) and ed. so there's
| not much limit to how far shell functions can go in
| augmenting your programming environment
|
| if awk, python, or perl is accepted, the possibilities
| expand further
| kelnos wrote:
| Sure, but this is taking things to a bit of an absurd
| extreme. If I worked in a restrictive environment where I
| couldn't install my own tools, I don't think I would be
| in a position to burn a ton of my employer's time
| building sophisticated development tools in bash.
|
| (One-off small scripts for things, sure. But I'm not
| going to implement something like ctags or cscope or a
| LSP server in bash.)
| kragen wrote:
| certainly it's absurd! nobody would deny that. on the
| other hand, the problem to solve is also an absurd
| problem
|
| and i wasn't suggesting trying to bill for doing it, but
| rather, if you were frequently in this situation, it
| might be reasonable to spend non-billable time between
| clients doing it
| gregjor wrote:
| I guess I don't see the problem as absurd. As a
| freelancer I need to focus on the problems the customer
| will pay for. I don't write code for free or in my spare
| time anymore, though I used to years ago. I feel comfortable
| working with the constraints imposed; I think of that as
| a valuable skill, not a handicap.
| kragen wrote:
| i see. thank you very much for being willing to share
| your invaluable experience and hard-won wisdom
| VHRanger wrote:
| There's also helix now, which requires next to no setup,
| but requires learning new motions (subject is before the
| verb in helix)
| gregjor wrote:
| I looked at Helix but since I dream in vim motions at
| this point (vi user since it came out) I'd have to see a
| 10x improvement to switch. VSCode didn't give me a 10X
| improvement, I doubt Helix would.
| VHRanger wrote:
| Helix certainly won't give you a 10x improvement. It
| tends to convert a lot of people moving "up" from VS
| Code, plus a decent but certainly smaller chunk of
| neovim users moving "down".
|
| Advantages of Helix are pretty straightforward:
|
| 1. Very little configuration bullshit to deal with.
| There's not even a plugin system yet! You just paste your
| favorite config file and language/LSP config file and
| you're good to go. For anything else, submit a pull
| request.
|
| 2. Built in LSP support for basically anything an LSP
| exists for.
|
| 3. There's a bit of a new generation command line IDE
| forming itself around zellij (tmux that doesn't suck) +
| helix + yazi (basically nnn or mc on crack, highly
| recommended).
|
| That whole zellij+helix+yazi environment is frankly a joy
| to work in, and might be the 2-3x improvement over neovim
| that makes the switch worth it.
| gregjor wrote:
| Like I wrote, I looked at Helix. Seems cool but not
| enough for me to switch. And I would have to install it
| on the machines I work on, which very often I can't do
| because of company policies, or can't waste the non-
| billable time on.
|
| I only recently moved from screen to tmux, and I still
| have to fall back to screen sometimes because tmux
| doesn't come with every Linux distro. I expect I will
| retire before I think tmux (or screen, for that matter)
| "sucks" to the point I would look at something else. And
| again I very often can't install things on customer
| servers anyway.
| VHRanger wrote:
| Tmux does suck pretty bad though?
|
| It conflicts with the clipboard and a bunch of hotkeys,
| and configuring it never works because they make a breaking
| change to how their config file works every 6 months or so.
|
| These days I only use it to launch a long running job in
| ssh to detach the session it's on and leave.
| gregjor wrote:
| That's more or less what I use it for -- keeping sessions
| alive. I don't use 90% of the features. vim does splits,
| and there's ctrl-Z to background it and get a shell.
|
| I know I could get more out of tmux but haven't really
| needed to. I use it with the default config. I have
| learned from experience that the less I try to customize
| my environment the less non-billable time I waste trying
| to get that working and maintaining it.
| gen220 wrote:
| You can use a tool like ALE (the Asynchronous Linting
| Engine) to run LSPs in normal-Vim; I've been doing it for
| years and have no complaints! It's rapid.
| joe-six-pack wrote:
| VSCode is not an IDE, it's an extensible text editor. IDEs
| are _integrated_ (it's in the name) and get developed as a
| whole. I'm 99% certain that if you were forced to spend a
| couple of months in a real IDE (like IDEA or Rider), you
| would not want to go back to vim, or any other text editor.
| Speaking as a long time user of both.
| gregjor wrote:
| I get your point, but VSCode does far more than text
| editing. The line between an advanced editor and an IDE
| gets blurry. If you look at the Wikipedia page about
| IDEs[1] you see that VSCode ticks off more boxes than not.
| It has integration with source code control, refactoring, a
| debugger, etc. With the right combination of extensions it
| gets really close to an IDE as strictly defined. These days
| advanced text editor vs. "real" IDE seems more like a
| distinction without much of a difference.
|
| You may feel 99% certain, but you got it wrong. I have
| quite a bit of experience with IDEs, you shouldn't assume I
| use vim out of ignorance. I have worked as a programmer for
| 40+ years, with development tools (integrated or not) that
| I have forgotten the names of. That includes "real" IDEs
| like Visual Studio, Metrowerks CodeWarrior, Symantec Think
| C, MPW, Oracle SQL Developer, Turbo Pascal, XCode, etc. and
| so on. When I started programming every mainframe and
| minicomputer came with an IDE for the platform. Unix came
| along with the tools broken out after I had worked for
| several years. In high school I learned programming on an
| HP-2000 BASIC minicomputer -- an IDE.
|
| So I have spent more than "a couple of months in real IDEs"
| and I still use vim day to day. If I went back to C++ or C#
| for Windows I would use Visual Studio, but I don't do that
| anymore. For the kind of work I do _now_ vim + ctags +
| ripgrep (and awk, sed, bash, etc.) get my work done. At my
| very first real job I used PWB/Unix[2] -- PWB means
| Programmer's Work Bench -- an IDE of sorts. I still use the
| same tools (on Linux) because they work and I can always
| count on finding a large subset of them on any server I
| have to work with.
|
| I don't dislike or mean to crap on IDEs. I have used my
| share of IDEs and would again if the work called for that.
| I get what I need from the tools I've chosen, other people
| make different choices, no perfect language, editor, IDE,
| what have you exists.
|
| [1] https://en.wikipedia.org/wiki/Integrated_development_environ...
|
| [2] https://en.wikipedia.org/wiki/PWB/UNIX
| mgsouth wrote:
| What IDE existed for the HP 2000? (Where I learned, too.
| In Portland :)
| gregjor wrote:
| Me too -- in Portland, Cleveland HS, mid-70s.
|
| The HP 2000 [1] had a timeshared BASIC system that the
| school district made available to schools, over ASR-33
| teletypes with dial-up modems. The BASIC system could
| edit, run (translate to byte code and execute), manage
| files. No version control or debuggers back then. The HP
| 2000 had another layer of the operating system
| accessible to administrators (the A000 account if I
| remember right) but it was the same timeshared BASIC
| system with some additional commands for managing user
| accounts and files.
|
| No one familiar with modern IDEs would recognize the HP
| 2000 BASIC system as an IDE, but it was self-contained
| and fully integrated around writing BASIC programs. HP
| also offered FORTRAN for it but not under the timeshared
| BASIC system. A friend wrote an assembler (in BASIC!) and
| taking advantage of a glitch in the bytecode interpreter
| we could load and run programs written in assembly
| language.
|
| After high school I got a job as night computer operator
| with the Multnomah County ESD (school district) so I had
| admin access to the HP 2000, and their two HP 3000
| systems, and an IBM computer they used for crunching
| class registrations. Good times.
|
| Someone had an emulator online for a while, accessible
| over telnet, but I can't find it now.
|
| [1] https://en.wikipedia.org/wiki/HP_Time-Shared_BASIC
| kragen wrote:
| i think it's very reasonable to describe time-shared
| basic systems like that as ides. the paradigmatic example
| of an 'ide' is probably turbo pascal 1.0, and of the
| features that separated turbo pascal from 'unintegrated'
| editor/compiler/assembler/linker/debugger setups, i think
| the dartmouth timesharing system falls mostly on the
| 'ide' side of the line. you could stop your program at
| any point and inspect its variables, change them,
| evaluate expressions, _change the source code_, and
| continue execution. runtime errors would also pop you
| into the interactive basic prompt where you could do all
| those things. i believe the hp 2000 timesharing basic had
| all these features, too
| gregjor wrote:
| At the time, in the context of other software development
| environments (like submitting decks of punch cards) the
| HP 2000 timeshared BASIC environment would count as an
| IDE. Compared to Turbo Pascal or any modern IDE it falls
| short.
|
| HP TSB did not have a REPL. If your program crashed or
| stopped you could not examine variables from the
| terminal. You could not peek or poke memory locations as
| you could with microcomputer BASICs (which didn't support
| multiple users, so didn't have the security concern). You
| had to insert PRINT statements to debug the code. TSB
| BASIC didn't have compile/link steps, it tokenized the
| code as you entered the lines, and the interpreter
| amounted to a big switch statement on the tokens. P. J.
| Brown's book _Writing Interactive Compilers and
| Interpreters_ (1981) describes how TSB works. Eventually
| I got the source code to TSB (written in assembler) and
| figured it out for myself.
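|
| A minimal sketch of the "tokenize on entry, switch on tokens"
| idea described above. It is not TSB's actual design: the token
| codes, the four-statement BASIC subset, and the names `tokenize`
| and `run` are all assumptions made for illustration.
|

```python
# Hypothetical one-byte token codes; the real TSB values are not given here.
TOKENS = {"LET": 1, "PRINT": 2, "GOTO": 3, "END": 4}


def tokenize(line: str):
    """Turn '10 PRINT X' into (10, 2, 'X') at the moment the line is entered."""
    number, keyword, *rest = line.split(maxsplit=2)
    return int(number), TOKENS[keyword], rest[0] if rest else ""


def run(program_lines):
    """Execute tokenized lines in line-number order and return printed values."""
    program = {number: (token, arg)
               for number, token, arg in map(tokenize, program_lines)}
    order = sorted(program)
    variables, output = {}, []
    pc = 0
    while pc < len(order):
        token, arg = program[order[pc]]
        pc += 1
        # The "big switch": one branch per token code.
        if token == 1:      # LET name = integer
            name, value = arg.split("=")
            variables[name.strip()] = int(value)
        elif token == 2:    # PRINT name
            output.append(variables[arg.strip()])
        elif token == 3:    # GOTO line
            pc = order.index(int(arg))
        elif token == 4:    # END
            break
    return output
```

|
| Because keywords are replaced by codes when a line is typed, the
| run loop never parses keyword text again; it only dispatches on
| the stored code.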
|
| Other BASIC implementations that popped up around the
| same time had richer feature sets. In my senior year at
| high school I got (unauthorized) access to a couple of
| Unix systems in Portland, ordered the _Bell Labs
| Technical Journal_ issues that described Unix and C, and
| taught myself from those. I didn't get paid to work on a
| Unix system until several years later (detours into
| RSTS-11, TOPS-20, VMS, Microdata, Pr1me, others) but I
| caught the Unix and C bugs young and I still work with
| those tools every day.
|
| Some programmer friends and more than a few colleagues
| over the years have made fun of my continued use of what
| they call obsolete and arcane tools. I don't mind, I have
| never felt like I did less or sloppier work than my co-
| workers, and my freelance customers don't care what I use
| as long as I can solve their problems. Most of the work
| in programming doesn't happen at the keyboard anyway. I
| do pay attention and experiment with all kinds of tools
| but I usually end up going back to the Unix tools I have
| long familiarity with. That said I did spend many years
| in Visual Studio, MPW, and CodeWarrior writing C and
| C++ code, and I do think those tools (well, maybe not
| MPW) offered a lot of benefits over coding with vim and
| grep, for the projects I did back then.
|
| Maybe ironically I use an iPad Pro, I don't have a
| desktop or laptop anymore. So I have the most modern
| hardware and a touch-based (soon) AI-assisted operating
| system that runs a terminal emulator during my work time.
| kragen wrote:
| thank you! i didn't realize that it lacked the features
| of the microcomputer basics; i have the impression that
| they were copying the same dartmouth timesharing system
| that hp was copying, but of course i've never used dtss
| myself
|
| what kind of obsolete and arcane tools do you use? vim
| seems to be pretty popular among the youngsters these
| days. a friend of mine a bit younger than you prefers ex
| kelnos wrote:
| I think you're arguing semantics here in a way that's not
| particularly productive. VSCode can be set up in a way that
| is nearly as featureful as an IDE like IntelliJ IDEA or
| Eclipse, and the default configuration and OOB experience
| pushes you hard in that direction. VSCode is designed for
| software development, not as a general text editor; I would
| never open up VSCode to edit a configuration file or type
| up a text file of notes, for example.
|
| Something like vim is designed as a general text-editing
| tool. Sure, you can load it up with plugins and scripts
| that give you a bunch of features you'd find in an IDE, but
| the experience is not the same, and the "integrated" bit of
| "IDE" is still just not there.
|
| (And I say this as someone who does most of his coding in
| vim, with LSP plugins installed, only reaching for a
| "proper" IDE for Java and Scala.)
|
| One metric I would use: if I can sit down at a random co-
| worker's desk and feel more or less at home in their editor
| of choice, then it's probably an IDE that has reasonable
| defaults and is geared for software development. IDEA and
| VSCode would qualify... vim would certainly not.
| kelnos wrote:
| I have similar feelings... I still use IntelliJ IDEA for JVM
| languages, but for C, Rust, Go, Python, etc., I've been using
| vim for years (decades?), and that's just how I prefer to
| write code in those languages. I do have LSP plugins
| installed in vim for the languages I work in, and do have a
| key sequence mapped for jump-to-definition... but I still
| find myself (rip)grepping through the source at least as
| often as I j-t-d, maybe more often.
| sauercrowd wrote:
| strongly disagree here. This works if
|
| - your IDE/language server is performant
|
| - all the tools are fully set up
|
| - you know how to query the specific semantic entity you're
| looking for (remembering shortcuts)
|
| - you are only interested in a single specific semantic entity
| (mixing entities is rarely supported)
|
| I don't map out projects in terms of semantics, I map out
| projects in files and code. That makes querying intuitive and
| I can easily compose queries that match the specificity of what
| I care about (e.g. I might want to find a `Server` but I want
| to show classes, interfaces and abstract classes alike).
|
| For the specific toolchain I'm using - typescript - the symbol
| search is also unusable once it hits a certain project size,
| it's just way too slow for it to be part of my core workflow
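|
| A rough sketch of the kind of composed text query described
| above: one regex that finds classes, abstract classes and
| interfaces whose name contains `Server` in TypeScript source.
| The pattern and the function name are assumptions, and a regex
| like this is a heuristic rather than a parser.
|

```python
import re

# Optional "export" and "abstract", then "class" or "interface",
# then any identifier that contains "Server".
DECL_RE = re.compile(
    r"^\s*(?:export\s+)?(?:abstract\s+)?(?:class|interface)\s+(\w*Server\w*)",
    re.MULTILINE,
)


def find_server_declarations(source: str) -> list[str]:
    """Return the declared names, in the order they appear in the source."""
    return DECL_RE.findall(source)
```

|
| Plain usages such as `new HttpServer()` are skipped because the
| pattern requires a declaration keyword at the start of the line.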
| db48x wrote:
| True, but IDEs are fragile tools. Sometimes you want to fall
| back to simpler tools that will always work, and grep is not
| fragile.
| cxr wrote:
| The basis of this article (and its forebear "Too DRY - The
| Grep Test"[1]) is that grep _is_ fragile. It's just fragile
| in a way that's different from the way that IDEs are fragile.
|
| 1. <http://jamie-wong.com/2013/07/12/grep-test/>
| kragen wrote:
| posts like this sound like the author routinely solves harder
| problems than you are, because the solutions you suggest don't
| work in the cases the post is about. we've had 'go to
| definition' since 01978 and 'find usages' since 01980, and you
| should definitely use them for the cases where they work
| mjr00 wrote:
| From the article,
|
| - dynamically built identifiers is 100% correct, never do
| this. Breaks both text search and symbol search, results in
| complete garbage code. I had to deal with bugs in early
| versions of docker-compose because of this.
|
| - same name for things across the stack? Shouldn't matter,
| just use find usages on `getAddressById`. Also easy way to
| bait yourself because database fields aren't 1:1 with front-
| end fields in anything but the simplest of CRUD webshit.
|
| - translation example: the fundamental problem is using
| strings as keys when they should be symbols. Flat vs nested
| is irrelevant here because you should be using neither.
|
| - react component example: As I mentioned in another comment,
| trivially managed with Find Usages.
|
| Nothing in here strikes me as "routinely solves harder
| problems," it's just standard web dev.
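|
| To make the first point concrete, here is a small sketch of a
| dynamically built identifier next to a spelled-out one. The
| handler names and the `kind` values are hypothetical, chosen
| only for illustration.
|

```python
def handler_name_dynamic(kind: str) -> str:
    # The complete handler names are never written out in this function,
    # so a text search for one of them will not lead here.
    return "handle_" + kind + "_address"


# Spelling each name out costs a few lines but keeps every name searchable.
HANDLER_NAMES = {
    "shipping": "handle_shipping_address",
    "billing": "handle_billing_address",
}


def handler_name_explicit(kind: str) -> str:
    return HANDLER_NAMES[kind]
```

|
| Both functions return the same strings; only the second version
| can be found by searching for the full name.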
| kragen wrote:
| yes, i agree that standard web dev is full of these
| problems, which can't be solved with go-to-definition and
| find-usages. it's a mess. i wasn't claiming that these
| messy, hard problems where grep is more helpful than etags
| are exotic; they are in fact very common. they are harder
| than the problems lucumo is evidently accustomed to dealing
| with because they don't have correct, complete solutions,
| so we have to make do with heuristics
|
| advice to the effect of 'you should not make a mess' is
| obviously correct but also, in many situations, unhelpful.
| sometimes i'm not smart enough to figure out how to solve a
| problem without making a mess, and sometimes i inherit
| other people's messes. in those situations that advice
| decays into 'you should not try to solve hard problems'
| lucumo wrote:
| > they are harder than the problems lucumo is evidently
| accustomed to dealing with because they don't have
| correct, complete solutions, so we have to make do with
| heuristics
|
| Funny.
|
| But since you asked. The hardest problems I've solved
| haven't been technical problems for years. Not that I
| stopped solving technical problems, or that I started
| solving only the easier problems. I just learned to solve
| people problems more.
|
| People problems are much harder than technical problems.
|
| The author showed a simple people problem: someone who
| needs to know about better tooling. If we were working
| together, showing them some tricks wouldn't take much
| time and would improve their productivity.
|
| An example of a harder problem is when someone tries to
| play aggressive little word games with you. For example,
| trying to put you down by loudly making assumptions about
| your career and skills. One way to deal with that is to
| just laugh it off. Maybe even make a self-deprecating
| joke. And then continuing as if nothing happened.
|
| But that assumes you want or have to continue working
| productively with them. If you don't, it can be quite
| enjoyable to just laugh in their face. After all, it's
| never the sharpest tool in the shed, or the brightest
| light that does that. In fact, it's usually the least
| useful person around, who is just trying to hide that
| fact. Of course, once you realize that, it becomes hard
| to laugh, because it's no longer funny. Just sad and
| pitiful.
| kragen wrote:
| > _look! i already told you! i deal with the god damned
| customers so the engineers don't have to! i have people
| skills! i am good at dealing with people! can't you
| understand that? what the hell is wrong with you people?_
|
| (office space,
| https://www.youtube.com/watch?v=hNuu9CpdjIo)
|
| look, lucumo, i'm sure you have excellent people skills.
| which is why you're writing five-paragraph power-trip-
| fantasy comments on hn about laughing in people's faces
| as you demonstrate your undeniable dominance over them,
| then take pity on them. but i'm not sure those comments
| really represent a contribution to the conversation about
| code greppability; they're just ego defense. you probably
| should not have posted them
| kragen wrote:
| (edited to remove things that could be interpreted as a
| personal attack, since i've gotten feedback that my
| previous phrasing was too inflammatory)
|
| you aren't the first person i've seen expressing the
| transparently nonsensical sentiment that 'people problems
| are much harder than technical problems'. i've seen it
| over and over again for decades, but i've never seen a
| clear and convincing explanation of why it's nonsense; i
| think this is worth discussing in some depth
|
| an obvious thing about both people problems and technical
| problems is that they both cover a full spectrum of
| difficulty from trivial to impossible. a trivial people
| problem is buying a soft drink at a convenience store+. a
| trivial technical problem is tying your shoes. an
| impossible people problem is ending poverty. an
| impossible technical problem might be finding a
| polynomial-time decision procedure for an np-complete
| problem, or perhaps building a perpetual-motion machine,
| or a black-hole generator. both kinds of problems have
| every degree of difficulty in between, too. stable blue
| leds seemed like an impossible technical problem until
| shuji nakamura figured out how to make them. conquering
| asia seemed like an impossible people problem until
| genghis khan did it
|
| even within the ambit of modifying a software system,
| figuring out what parts of the code are affected by a
| possible change, there are trivial technical problems and
| problems that far exceed current human capacities. nobody
| knows how to write a bug-free web browser or how to
| maintain the linux kernel without introducing new bugs
|
| given this obvious fact, what are we to make of someone
| saying, 'people problems are much harder than technical
| problems'? obviously it isn't the case that _all_ people
| problems are much harder than _all_ technical problems,
| given that some people problems are easy, and some
| technical problems are impossible. and if we interpret it
| as meaning that _some_ people problems are much harder
| than _some_ technical problems, it's a trivial tautology
| which would be just as true if we reversed the terms to
| say '[some] technical problems are much harder than
| [some] people problems'. so nobody would bother making
| the effort to say it unless they thought someone was
| asserting the equally ridiculous position that all people
| problems were easier than technical problems
|
| the most plausible interpretation is that it means that
| the people problems _the speaker is most familiar with_,
| and therefore considers typical, are much harder than the
| technical problems _the speaker is most familiar with_.
| it's not a statement about the world; it's a statement
| about the author and the environment they're familiar
| with
|
| we can immediately deduce from this that you are not
| andrew wiles, who spent six years working alone on a
| technical problem which had eluded the world's leading
| mathematicians for some 350 years, for the solution of
| which he was appointed a knight commander of the order of
| the british empire and awarded the abel prize, along with
| a long list of other prizes. you give the appearance of
| being so unfamiliar with such difficult technical
| problems that you cannot imagine that they even exist,
| though surely with a little thought you can see that they
| do. in any case, for a very long time, you have not been
| working on any technical problems that seem impossible to
| you. i believe you that it's not that you _started_
| solving only the easier problems; that means that all the
| problems you ever solved were the easier problems
|
| or, more briefly, you aren't accustomed to dealing with
| difficult technical problems
|
| perhaps we can also infer that you frequently handle very
| difficult people problems--perhaps you are a politician
| or a clinical psychologist in a mental institution, or
| you have family members with serious mental illness.
| however, other aspects of your comment make that seem
| relatively unlikely
|
| ______
|
| + if you have no money or don't speak the local language,
| this people problem becomes less trivial
| hyperpape wrote:
| I can run rg over my project faster than I can do anything in
| my IDE. Both tools have their places.
| mjr00 wrote:
| > Honestly, posts like this sound like the author needs to
| invest some time in learning about better tools for his
| language. A good IDE alone will save you so much time.
|
| Completely agreed. The React component example in the article
| is trivially solvable with any modern IDE; right click on class
| name, "Find Usages" (or use the appropriate hotkey, of course).
| Trying to grep for a class name when you could just do that is
| insane.
|
| I mainly see this from juniors who don't know any better, but
| as seen in this thread and the article, there are also
| experienced engineers who are stubborn and refuse to use tools
| made after 1990 for some reason.
| gpderetta wrote:
| I worked on codebases large enough where enabling
| autocomplete/indexing would lock the IDE and cause the
| workstation to swap hard.
| alternatex wrote:
| That's a problem of code organisation though. Large
| codebases should be split into multiple repos. At the end
| of the day code structure is not something to be decided
| only by compilation strategy, but by developer ergonomics
| as well. A massive repo is a massive burden on
| productivity.
| EasyMark wrote:
| It seems like the law of diminishing returns; while I'm sure in
| a few cases this characteristic of a code writing style is
| extremely useful, it cuts into other things such as readability
| and conciseness. Fewer lines can mean fewer bugs, within
| reason. If you aren't in lisp and are using more than 3
| parentheses, you might want to split it up because the
| compiler/JIT/interpreter is going to anyway.
| brooke2k wrote:
| with all due respect, it sounds like you have the privilege of
| working in some relatively tidy codebases (and I'm jealous!)
|
| with a legacy codebase, or a fork of a dependency that had to
| be patched which uses an incompatible buildsystem, or any
| C/C++/obj-c/etc that heavily uses the preprocessor or
| nonstandard build practices, or codebases that mix lots of
| different languages over awkward FFI boundaries and so on and
| so forth -- there are so many situations where sometimes an IDE
| just can't get you 100% of the way there and you have to revert
| to grepping to do any real work
|
| that being said, I don't fully support the idea of handcuffing
| your code in the name of greppability, but I think dismissing
| it as a metric under the premise that IDEs make grepping
| "obsolete" is a little bit hasty
| lucumo wrote:
| > with all due respect, it sounds like you have the privilege
| of working in some relatively tidy codebases (and I'm
| jealous!)
|
| I wish, but no. I've found people will make a mess of
| everything. Which is why I don't trust solutions that rely on
| humans having more discipline, like what this article
| advocates.
|
| In any situation where grep is your last saviour, you cannot
| rely on the greppability of the code. You'll have to check
| and double check everything, and still accept the risk of
| errors.
| jmmv wrote:
| Sure, if you have the luxury of having a functional IDE for all
| of your code.
|
| You can't imagine how much faster I was than everybody else at
| answering questions about a large codebase just because I knew
| how to use ripgrep (on Windows). "Knowing how to grep" is a
| superpower.
| kelnos wrote:
| Even with IDEs, I find that I grep through source trees fairly
| often.
|
| Sometimes it's because I don't completely trust the IDE to find
| everything I'm interested in (justifiably; sometimes it
| doesn't). Sometimes it's because I'm not looking to dive into
| the code and do serious work on it; I'm just doing a quick
| drive-by check/lookup for something. Sometimes it's because I'm
| ssh'd into another machine and I don't have the ability to
| easily open the sources in an IDE.
| wglb wrote:
| A bit on the other side of the argument, I use grep plus find
| plus some shell work to do source code analysis for security
| reviews. grep doesn't really understand the syntax of
| languages, and that is mostly OK.
|
| I've used this technique on auditing many code bases including
| the C family, perl, Visual Basic, C# and SQL.
|
| With this sort of tool, I don't need to look for language-
| particular parsers--so long as the source is in a text file,
| this works well.
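|
| A minimal sketch of the "find plus grep" style of review
| described above. The list of risky C calls and the file
| suffixes are assumptions picked for the example, not the
| commenter's actual patterns.
|

```python
import os
import re

# Hypothetical starting list of calls worth a second look in C code.
RISKY_CALL = re.compile(r"\b(strcpy|sprintf|gets|system)\s*\(")


def scan_lines(lines, regex=RISKY_CALL):
    """Yield (line number, stripped line) for each line the regex matches."""
    for lineno, line in enumerate(lines, start=1):
        if regex.search(line):
            yield lineno, line.strip()


def scan_tree(root, suffixes=(".c", ".h")):
    """Walk root (the 'find' part) and scan each matching file (the 'grep' part)."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffixes):
                path = os.path.join(dirpath, name)
                with open(path, encoding="utf-8", errors="replace") as handle:
                    for lineno, text in scan_lines(handle):
                        yield path, lineno, text
```

|
| Nothing here understands C syntax, so matches inside comments
| or strings show up too; the output is a list of places to read,
| not a list of bugs.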
| umvi wrote:
| Interface-heavy languages break IDEs. In .NET at least, "go to
| definition" jumps you to the _interface_ definition which you
| probably aren't interested in (vs. the specific implementation
| you are trying to dig into). Also with .NET specifically XAML
| breaks IDE traceability as well.
| ilrwbwrkhv wrote:
| I tried a good IDE recently: Jetbrains IntelliJ and Webstorm.
| Considered the topdog of IDEs. Was working on a typescript
| project which uses npm link to symlink another local project
| into the node_modules of current project.
|
| The great IDEs IntelliJ and Webstorm stopped autosuggesting
| completions from the symlinked project.
|
| Open up Sublime Text again. Worked perfectly. That is why
| Jetbrains and their behemoth IDEs are utter shite.
|
| Write your code to have symmetry and make it easy to grep.
| 71bw wrote:
| >I tried a good IDE recently: Jetbrains IntelliJ
|
| Having dealt with IntelliJ for 3 years due to education stuff
| - I laughed out here. Even VS is better than ideaj.
| mihaaly wrote:
| "A good IDE"
|
| I am also waiting for world peace! ; )
| groby_b wrote:
| Working on a 32MLOC project, text search is still the quickest
| way to find a hook that gets you to the deeper investigation.
| From _there_, finding definitions/usage definitely matters.
|
| You can maybe skip the greppability if the code base is of a
| size that you can hold the rough shape and names in your head,
| but a "get a list of things that sound like they might be
| related to my problem" operation is still extremely helpful.
| And it's also worth keeping in mind that greppability matters
| to onboarding.
|
| Does that mean it should be an overriding design concern? No.
| But it does mean that if it's cheap to build greppable, you
| probably should, because it's a net positive.
| mrb wrote:
| I will always remember my professor explaining that greppability
| is the reason C++ casting operators use a long keyword:
| static_cast<...> const_cast<...>, etc as you can easily grep for
| "_cast" or the whole keyword.
| whirlwin wrote:
| Code grepping at build time can be useful.
|
| Grepping at runtime, if you can call it that, is also very
| powerful. If you have a binary, either your company or a third
| party one, but don't have the source code easily available, I
| have used the `strings` program from GNU binutils which shows
| tokens in binary code, e.g. hardcoded URLs, credentials and so
| on. It can also be useful for analyzing certain things in memory.
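|
| For illustration, the core of what `strings` does can be
| approximated in a few lines of Python. This is only a rough sketch:
| the function name `printable_strings` and the sample bytes are made
| up here, and the real tool supports more encodings and options.

```python
import re


def printable_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of at least min_len printable ASCII bytes found in data."""
    # 0x20-0x7e covers space through tilde, i.e. printable ASCII.
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [run.decode("ascii") for run in pattern.findall(data)]


# Hypothetical binary blob with two embedded strings and one short run.
blob = b"\x00\x01https://example.com/api\x00ab\x00token=abc\xff"
print(printable_strings(blob))
```

| The short run "ab" is dropped because it is below the minimum
| length, which is what keeps the output mostly free of noise.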
| jonathanyc wrote:
| Related--"Too DRY - The Grep Test" by Jamie Wong:
| http://jamie-wong.com/2013/07/12/grep-test/
| IshKebab wrote:
| This is why I always recommend avoiding kebab-case as much as
| possible. You'll eventually need to convert it to snake_case and
| now you have broken grep. (Nobody is going to remember to use a
| regex every time.)
| creesch wrote:
| I fully understand the point the author is making. However, I am
| not going to sacrifice good JSON and make it flat just so someone
| can search for it more easily. With the example they give, it is
| still readable because it is a simple data structure. But with
| more complex data their flat structure to me does not make it
| easy to parse and easier to make mistakes as well.
| smartmic wrote:
| It's often a matter of having the right tool for the job. In
| your case, https://github.com/tomnomnom/gron might be useful.
| creesch wrote:
| Well, I'd say that in the author's case it might be more
| useful. ;) I never really had the inclination to grep for
| data like the author does.
|
| I generally work from an IDE anyway, where it is clear that I
| am working with a value that is part of a JSON object and I
| can follow it back to the proper structure anyway. In fact,
| the more I think about it, the more I feel like the article
| is written for a very specific use case and perspective.
| Almost to the point where the saying " _if all you have is a
| hammer, everything looks like a nail_ " is applicable. Where
| if it doesn't look enough like a nail it should be adjusted
| to look more like one instead of expanding your toolbox a
| bit.
| shahzaibmushtaq wrote:
| This reminds me of the _good practices and guidelines_ in coding
| when I was learning to code, which also includes "proper naming"
| so you can easily find what you are looking for throughout the
| codebase.
| berkes wrote:
| Me too.
|
| But that's also what makes me uncomfortable when reading this
| article. _Proper Naming_ is truly an "art" of balancing trade-
| offs.
|
| It takes domain expertise (Ubiquitous language), understanding
| of the users of the code (other devs, not end-users), and a
| lifetime of coding f*ups where naming something wrong turned
| out painful to balance these.
|
| The author gives a nice example of a dynamic table naming. But
| their refactoring didn't keep the behaviour the same (the
| else/catch). So it's hard to argue the first is better. And in
| this case, even without the else/catch, I'd say the latter is
| better. But there will be cases where greppability is to be
| balanced with readability, testability or refactorability. And
| in these cases, for me, greppability comes last.
| VoxPelli wrote:
| I advocate for greppability as well - and in Swedish it becomes
| extra fun - as the equivalent phrase in Swedish becomes "grep-
| bar" or "grep-barhet" and those are actual words in Swedish -
| "greppbar" roughly means "understandable", "greppbarhet" roughly
| means "the possibility to understand"
| elygre wrote:
| Could I suggest that greppbarhet is more precisely translated
| as "the ability of being understood"?
|
| (Norwegian here. Our languages are similar, but we miss this
| one.)
| psychoslave wrote:
| So, at the extreme opposite of the esoteric "general regular
| expression print" that grep stands for with few ever knowing
| it?
| johncoltrane wrote:
| s/general/global
| medstrom wrote:
| Norwegian still translates grep as "grip"/"grab". I always
| thought of grepping as reaching in with a hand into the text
| and grabbing lines. That association is close at hand (insert
| lame chuckle) for German and English speakers too.
| pbhjpbhj wrote:
| In English that association is going to depend a lot on
| one's accent; until now I've never associated grep-ing with
| anything other than using grep! (But, equally, that might
| just be a me thing.)
| bee_rider wrote:
| It doesn't sound anything like grip in my accent but for
| some reason the association has always been there for me.
| Grabbing or ripping parts from the file.
| medstrom wrote:
| What about _groping_? Groping around for text.
| sshine wrote:
| How many other UNIX commands did the Swedes adopt into their
| language?
|
| I know that they invented "curl". Do you _tar xfz_?
| lukan wrote:
| As far as I understood, it was part of the language before.
|
| The german equivalent of the word would be probably
| "greifbar". Being able to hold something, usually used
| metaphorically.
| kagevf wrote:
| > able to hold
|
| Would "grasp" work?
| octocop wrote:
| It's closer to grip
| n_plus_1_acc wrote:
| I've always related grep to grab
| trashtester wrote:
| "zu greifen" may best translate to "to grip", but "grip"
| has different mental connotations in English (it refers
| to mental stability, not intellectual insight).
|
| The best dual purpose translation of "zu greifen"/"gripe"
| (German/Scandinavian) meaning "zu
| begreifen"/"begripe"/"understand" would be "to grasp",
| which covers both physically grabbing into something and
| also to understand it intellectually.
|
| All these words stem back to the Proto-Indo-European
| gkrebk, which more or less completes the circle back to
| "grep".
| lordgrenville wrote:
| related to "grok"?
| trashtester wrote:
| grok /grak/
|
| Origin 1960s: a word invented by Robert Heinlein
| (1907-88), American author.
| actionfromafar wrote:
| Yes. "Grasping for straws."
| ManuelKiessling wrote:
| Which leads to "begreifbar", which I would
| explain/translate (badly) with "something is begreifbar if
| it can be understood".
| scbrg wrote:
| We do tar, for xfz I think you have to look to the Slavic
| languages :)
|
| Anyway, to answer your question:
|
|     $ grep -Fxf <(ls -1 /bin) /usr/share/dict/swedish
|     ack ar as black dialog dig du ebb ed editor finger flock gem
|     glade grep id import last less make man montage pager pass pc
|     plog red reset rev sed sort sorter split stat tar test
|     transform vi
|
| :)
|
| [edit]: Ironically, _grep_ in that list is _not_ the same
| word as the one OP is talking about. That one is actually
| based on _grepp_, with the double p. _grep_ means pitchfork.
| pbhjpbhj wrote:
| Pitchfork? As in something that might be used to search a
| haystack?? How delightful.
| sshine wrote:
| Yeah, that's one type.
|
| Another is for turning soil at a small scale by hand
| (also called a cultivator, I think).
|
| But they all have somewhat long prongs.
| tripzilch wrote:
| I learned from bash.org that "tar -xzvf" is in German accent
| for "xtract ze vucking files".
| Tmpod wrote:
| A classic
| Aachen wrote:
| Party pooper checking in: easier to remember is that v is
| the verbose option in most tools, x and f you already know,
| z is auto-detected for as long as I remember so you don't
| need to pass that. Add c for creating an archive and,
| congratulations, you can now do 90% of the tasks you'll
| ever want to do with tar, especially defusing xkcd bombs!
|
| (To go for 99%, add t for testing an archive to your
| repertoire. This is all I ever use; anything else I do with
| the relevant tools that I already know, like compression
| settings `tar c . | zstd -19 > my.tar.zstd` or extracting
| to a folder `cd /to/here && tar xf ~/Downloads/ar.tar`. I'm
| sure tar has options for all this but that's not the _one
| thing_ it should do and do well.)
|
| I hadn't heard of the German option but I love it, shame
| really that z is obsolete :(
| tripzilch wrote:
| I mean, you're not wrong. Learning what stuff means is
| good :) But there's also the part where making up a
| ridiculous story, pun or such enables it being a very
| strong mnemonic.
|
| I know v is just the verbose option, though I didn't know
| z was autodetected.
|
| Way back (~15y or so?) I was reading bash.org just for
| the jokes cause I was on IRC, I knew what a tar/tar.gz
| file is, but I had never needed to extract one from the
| command line (might've been on Windows back then).
| However, because I remembered the funny joke, the first
| time I was on a Linux system confronted with a tgz, I
| knew exactly what to type :)
|
| Honestly to this day, I've never needed to create a tar
| archive, only to unpack them (when I need to
| archive+compress files it's usually to send to other
| people, and I pick zip cause everyone can deal with it).
| But `tar --help` and `man tar` are there in case I ever
| might.
| vanschelven wrote:
| Begreppelijk (begrijpelijk) in Dutch
| Cthulhu_ wrote:
| or "Grijpbaar" (grabbable)
| medstrom wrote:
| So Dutch/German make "begreif" a verb, for Swedish it is
| just a noun (that means "concept").
|
| But "begrijpelijk" has a clone: "begriplig". An adverb
| based on a verb in a foreign dictionary. There is no verb
| that goes "begreppa", it's just "greppa".
| trashtester wrote:
| "Jag kan inte begripa svenska."
| medstrom wrote:
| Oh, you're right.
| fedder wrote:
| The term concept itself suggests grasping or
| holding/taking hold of, see the latin verb concipio or
| adjective conceptus.
| jeroenhd wrote:
| Dutch also has a noun ("begrip") meaning "notion" or
| "understanding".
| octocop wrote:
| And we also have "begrepp", which is also a spin on content and
| understanding its content.
| majewsky wrote:
| Oh, that's like German "begreifen", no? (Which means "to
| grok".)
| medstrom wrote:
| Grok is right! I'd translate Swedish "greppbar" directly as
| "grokkable"; "att greppa" as "to grok".
| TeMPOraL wrote:
| Which is ironic, given that the article is about making it
| easier to use grep _in order to avoid having to understand
| anything_.
| bob88jg wrote:
| Nah, you've got it backwards. The article isn't about dodging
| understanding - it's about making it way easier to spot
| patterns in your code. And that's exactly how you start to
| really get what's going on under the hood. Better searching =
| faster learning. It's like having a good map when you're
| exploring a new city
| TeMPOraL wrote:
| The article advocates making code harder to understand for
| the sake of better search. It's like forcing a city to
| conform to a nice, clean, readable map: it'll make
| exploring easier for you, at the cost of making the city
| stop working.
| layer8 wrote:
| Graspability. ;)
|
| More customarily: intelligibility.
| mettamage wrote:
| greppbarhet
|
| Grijpbaarheid
|
| I never saw grep as grijp
|
| I guess I do now
|
| (Dutch btw)
| anordal wrote:
| Setting a variable by split identifier is surprisingly common in
| CMake (because functions can't return a value):
|
| > set(${VAR}_VERSION ${VERSION})
|
| This is the main reason I don't like CMake.
| nsonha wrote:
| is there something like a universal "semantic grep" for code? I
| think rating code based on some (limitation of a) tool, might not
| be the best way.
| traxys wrote:
| I read parts of the Linux kernel source code pretty often, and
| getting the definition of a function is often pretty involved:
|
| - I don't always know the return code type, as the calling code
| assigned a field whose definition I don't know to find either
|
| - I don't know if it's a C function or a preprocessor macro
|
| This often results in me searching for the exact function name,
| and combing through the uses in the drivers. You then need to
| restart all that recursively to understand the function you just
| read.
|
| I could use clangd for that, but I don't have the resources on
| my laptop to compile a kernel
| dvh wrote:
| Why not simply hold Ctrl and click on the name of the function?
| GeneralMayhem wrote:
| > I don't have the resources on my laptop to compile a kernel
| gregjor wrote:
| ctags?
| trussy wrote:
| You might find this site useful: https://elixir.bootlin.com
| hcfman wrote:
| .* is your friend :)
| larsrc wrote:
| As an avid grepper, I disagree with most of these specific
| recommendations. Use a tool that actually understands references.
| Don't make the code harder to read for humans just to please
| grep.
|
| As for identifiers, use 'foo.?bar' case-insensitively.
| medstrom wrote:
| Which of the examples are harder to read for humans?
| alkonaut wrote:
| There are of course cases of dynamic data in every language (The
| table name is an apt example) but usually when I look in code I
| just expect to be able to follow definitions. If the language
| doesn't reliably allow me to find "usages of this type" without
| risking finding _another type with the exact same name_ then I'm
| already starting up my static type system compiler for the
| rewrite.
|
| There are exceptions of course: when searching git logs, comments
| etc., what the language or IDE does doesn't help.
|
| And when searching for an unknown symbol (type, function,
| variable) you don't know the name of, but you know _should_ look
| like "Dog*Order" or "Order*Dog" is a common task too. In this
| case I'd probably search for "Dog.*Order\(" or
| "Order.*Dog\(" if I'm looking for a function. The language trait
| that enabled it is that method names are Pascal Case and always
| have an opening ( at the end. But my IDE at least lets me search
| for members (variables, functions) separate from type names.
| There should be an _index_ in the IDE though that lets you query
| this data. E.g. looking for types starting with foo could be done
| with search t:Foo, instead of having to grep for "(struct|class)
| Foo" or similar. Tooling is the key.
| berkes wrote:
| The author uses JavaScript and Python as examples. So I presume
| they have (most?) experience with dynamic languages.
|
| In static languages, greppability is hardly as much of a
| factor. Especially with the availability of LSPs and other such
| tools nowadays.
|
| When I write rust, or Java, I hardly grep, I "go to usages" or
| "go to definition", "rename symbol" and so on. Similar, but not
| to that extent, with typescript. But when coding in Javascript,
| Ruby or Python, no matter how fancy or language-focused an IDE
| is, I'll be grepping a lot. Decades of Ruby and Rails "black
| magic" taught me to grep for partial patterns like the author
| shows, too. Or to just run the code-path entirely (through
| tests) because the table-definition of the database will change
| the available methods and behaviour of the code. Yes. I know.
|
| An LSP (or linter, or checker) can only do so much when the
| available code, methods, classes, behaviour can be changed or
| added at runtime.
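|
| A small Python sketch of the kind of runtime "magic" meant here.
| The `Record` class and its `find_by_*` methods are hypothetical and
| only loosely modelled on ORM-style dynamic finders; they are not any
| real library's API.

```python
class Record:
    """Toy stand-in for an ORM model with a fixed list of columns."""

    columns = ("email", "name")  # hypothetical table columns

    def __init__(self, rows):
        self.rows = rows


def _make_finder(column):
    """Build a finder method that filters rows on one column."""
    def finder(self, value):
        return [row for row in self.rows if row[column] == value]
    return finder


# The finder methods are attached at import time from the column list,
# so no `def find_by_email` exists for grep or a static tool to locate.
for _column in Record.columns:
    setattr(Record, f"find_by_{_column}", _make_finder(_column))

people = Record([{"email": "a@example.com", "name": "Ann"}])
print(people.find_by_email("a@example.com"))
```

| Searching for the partial pattern `find_by_` is what eventually
| leads to the `setattr` line, which is the partial-pattern grepping
| described above.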
| alkonaut wrote:
| I'm happy to use dynamic languages occasionally too (Bash,
| Javascript, Python, ..) but I have a rule of thumb that says
| if I can't see the entire codebase on one screen, then it's
| too large for dynamic.
| pistoleer wrote:
| Would be great if the wider industry shared that view
| LegionMammal978 wrote:
| Then again, in languages like Java that go all-in on object
| orientation and dynamic dispatch, "go to definition" can
| become a real chore in sufficiently large codebases. I must
| have wasted hours of my life trying to find which class is
| the one implementing some interface or abstract method at a
| certain point. Bonus points if the implementing class changes
| based on ordinary runtime conditions.
| atoav wrote:
| One very simple way to make code less greppable is to use only
| single-letter variables or other short variables that are very
| likely to be contained in a ton of other words.
| ljsprague wrote:
| In other words: don't try to be clever?
| kmarc wrote:
| Some good recommendations in the article.
|
| Greppability is also helpful when you start scripting your
| editor. Vim has `includeexpr` and co. to implement some
| "intelligence" when trying to find declarations etc. This enabled
| me to write a couple line snippet that immediately could resolve
| Bazel starlark symbols even in "imported" (`load()`) files. At
| one point I realized I have better code navigation than any of my
| colleagues using IDEs.
|
| This, and tools like ripgrep really help a lot. This is something
| that VS Code developers also realized when they included ripgrep
| itself as their "backend" for searching in files.
| emblaegh wrote:
| Python people should think twice before implementing a `__call__`
| method if they want to improve greppability.
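|
| To make the point concrete, here is a minimal sketch with a
| made-up `Mailer` class: the `__call__` call site carries no method
| name to search for, while the named method does.

```python
class Mailer:
    """Hypothetical class used only to illustrate the point."""

    def __call__(self, to):
        return f"sent to {to}"

    def send(self, to):
        return f"sent to {to}"


mailer = Mailer()

# Searching for "__call__" finds the definition but not this call site.
implicit = mailer("ann@example.com")

# Searching for "send(" finds both the definition and this call site.
explicit = mailer.send("ann@example.com")
```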
| vijucat wrote:
| One other thing I'd like to add is greppable comments! In the
| same vein as TODO and FIXME, I use hashtags in comments to drop
| hints to future me reading the code. #learning is a universal
| one:
|
| // #learning: transparent color using color.new(color.white,
| 100). This is GREAT for hiding plot() lines during inapplicable
| periods (such as when no trade is on)
|
| But project-specific hashtags are quite useful, too.
|
| // #60within600: bunch API calls to not hit the 60 calls within
| 10 minutes limit
|
| // This memoizes fn call results to prevent #60within600
|
| The hashtagging was inspired long ago by del.icio.us, if you
| remember that. https://en.wikipedia.org/wiki/Delicious_(website)
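|
| As a rough sketch of how such tags could be collected into an
| overview, assuming `//` comments as in the examples above (the
| function name `index_hashtags` and the sample source are invented
| for illustration, and a `//` inside a URL would confuse it):

```python
import re
from collections import defaultdict

HASHTAG = re.compile(r"#(\w+)")


def index_hashtags(source: str) -> dict[str, list[int]]:
    """Map each hashtag found in a `//` comment to the line numbers using it."""
    index = defaultdict(list)
    for lineno, line in enumerate(source.splitlines(), start=1):
        _, marker, comment = line.partition("//")
        if not marker:
            continue  # no `//` comment on this line
        for tag in HASHTAG.findall(comment):
            index[tag].append(lineno)
    return dict(index)


sample = """\
// #60within600: bunch API calls to stay under the rate limit
fetchAll(batch);
// This memoizes fn call results to prevent #60within600
// #learning: transparent color hides plot() lines
"""
print(index_hashtags(sample))
```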
| philipwhiuk wrote:
| Seems like you're trying to implement a ticketing system in
| your code to me.
|
| If you need to prevent 60 within 600, write a test.
| vijucat wrote:
| I'm documenting the (scattered) implementation changes
| related to the feature / bug-fix, in both code and test
| cases, using the hashtag. You could just use the JIRA ticket
| number, too, but it's a bit dry.
| TeMPOraL wrote:
| Grep is indeed a critical tool for navigating and understanding
| an unfamiliar codebase, but greppability should not be a goal
| unto itself. The article seems to be making that mistake - it's
| basically advocating improving greppability at the cost of making
| the codebase even larger, messier, and harder to read: i.e.
| reinforcing the problem that makes you reach for grep in the
| first place. It's a false economy. It's asking you to optimize
| your code for one specific scenario - trying to figure out where
| an unfamiliar string comes from; but that isn't the most
| important or most frequent thing people need to do with code
| anyway.
|
| (If it is for you, congratulations, you're the janitor in the
| codebase. It sucks, but that's what you're being paid for.
| Maintenance is a means, not an end.)
|
| In particular, one of the most important and frequent things you
| do with code is _read it_ in order to understand it (locally, at
| the abstraction level of interest), and the advice from this
| article compromises it badly - almost as if hoping that, on a
| greppable enough codebase, you could use grep to _avoid reading
| or thinking entirely_.
|
| 1. Don't split up identifiers
|
| Don't split them up for the sake of splitting, sure. That's not
| helping anything. But in the example given, there's likely a good
| reason for it - for example, it codifies the intended coupling
| between tables. `billing_address` isn't an independent term in
| this code, nor are the other `_address` table names. There's a
| naming pattern there, encoded directly in the initial example.
| The proposed refactor obscures it _and_ triples the amount of
| code in the process (all of which is low-value noise) _and_
| introduces possibility of making errors (typos, copy-paste) of
| the kind that isn't picked up by compilers (hope you have good
| tests!).
|
| FWIW, the author's refactor _may_ be eventually required - if and
| when the naming pattern in the original code no longer holds. But
| not before then.
|
| 2. Use the same names for things across the stack
|
| Excessive data repackaging is bad, but that tends to be a symptom
| of having too many layers. A good layer has specific semantics
| that distinguish it from layers above and below it. This may
| necessitate renaming some things, in which case even if such
| renaming is as trivial as in the example, it should be spelled
| out explicitly; you can't just return Layer 1 Address object
| instead of Layer 3 Address object, if the two layers mean
| something different by "Address"; the triviality of the mapping
| is incidental and may not hold over time. If it really feels
| trivial, chances are one of the layers is not necessary in the
| first place, so go fix that.
|
| 3. Flat is better than nested
|
| Now that's just screwing with people, especially wrt. nesting
| namespaces. It's asking to reintroduce the visual noise that the
| person reading the code will then have to filter out again
| mentally.
|
| The way I see it, if you grep for some log message or unrolled
| identifier and can't find it, you're supposed to _keep grepping
| for parts of the string_ , until you hit a match. You then go
| look, and it's usually apparent that you're dealing with a
| compound identifier or an interpolated string - congratulations,
| you just learned something important about that part of the
| legacy codebase, _which is the real job you're supposed to be
| doing_.
| noufalibrahim wrote:
| I don't know how to validate this but this seems to be a specific
| case of "avoiding magic" where there's a lot of dynamically
| generated variables and things. Having the static text of the
| program more or less show its intent helps readability and
| searchability quite a bit.
|
| I suppose the other extreme is to have a program generator with
| an input spec and you being left to read through the generated
| code without access to the input spec.
| moomin wrote:
| If you really do want your code to be searchable, here's a couple
| of practices I've adopted:
|
| 1) Eliminate spelling mistakes. Eliminate alternative spellings.
| UK vs US English? Pick a side and stick to it.
|
| 2) Eliminate contractions. Or keep a very short list of allowable
| ones (We permit "info" for instance.)
|
| The point of this is to increase the predictability of the names
| you use. If you've got "tradeable" and "tradable" in your code
| base, searching for it is going to be a pain. You can supplement
| these rules with common coding standards like "We call these
| things providers." but just getting the spelling consistent is
| huge.
| peanut-walrus wrote:
| These are all extremely good suggestions. Especially the
| flattening bit - yes, it's verbose as hell, but it just makes so
| much sense whenever you have to deal with the code any time after
| writing it. Helm charts, please take note, the docs even say that
| "In most cases, flat should be favored over nested.", yet almost
| every time I have to deal with a Helm chart, it's a mess of
| nested structures.
| qwertox wrote:
| I use greppable strings explicitly, like
|
|     requests.get(f'http://a.b.c.d/wol?device={wol_computer}&grep-id=wake-on-lan', timeout=3)
|
| This way I find `grep-id` in the server logs as a reminder of
| what to grep for, then `grep-id=wake-on-lan` in the entire
| codebase to find the actual source of the call.
|
| Or I add comments with a greppable token to the code.
| arendtio wrote:
| I am firmly against the suggested changes. I love grepping
| through code too (often using -A -B -C), but I also like browsing
| the code, with tools where you can just click on a function and
| see its definition.
|
| However, changing how the code should be written so that grepping
| becomes easier is optimizing for the wrong target. It is much
| more important that the code is easily readable and maintainable.
|
| In addition, some tools are designed explicitly for grepping
| through code (off the top of my head, ack is an example). If grep
| doesn't work, one should try a more sophisticated tool instead of
| using different coding styles.
| gregjor wrote:
| Nothing the author wrote would necessarily make code harder to
| read or maintain. Consistent naming of the same thing
| throughout, not constructing variables or table names
| dynamically, etc. benefit both readers/maintainers and
| searching.
|
| I understood "grepping" to mean ripgrep (rg) or ack, not just
| plain grep. I think programmers who use command line tools or
| vim know about those. VSCode uses rg.
| lucideer wrote:
| Greppability is really a proxy metric here - these changes all
| have other benefits even if you never grep (mostly readability
| tbh).
|
|     const getTableName = (addressType: 'shipping' | 'billing') => {
|       return `${addressType}_addresses`
|     }
|
| This is a simplified example but in a longer function,
| readability of the `return` lines would be improved as the
| reader wouldn't have to reference the union type (which may or
| may not be defined in the signature). The rewrite is also safer
| as it errors out if a runtime `addressType` value doesn't match
| the union type (above code would not throw an error, just
| return an indeterminate value which would cause undefined
| behaviour).
|
| "Flat is better than nested" also greatly improves readability
| in both examples: either reading the i18n line, or reading the
| classname at definition / call will be more readable when the
| name contains full context of function.
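|
| For comparison, a sketch of the spelled-out variant, written in
| Python rather than TypeScript. The function name and error message
| are assumptions for illustration, not the article's exact code.

```python
def get_table_name(address_type: str) -> str:
    """Return the table name for an address type, spelling each name in full."""
    if address_type == "shipping":
        return "shipping_addresses"  # greppable as a whole string
    if address_type == "billing":
        return "billing_addresses"
    # Unknown values fail loudly instead of producing e.g. "foo_addresses".
    raise ValueError(
        f"address_type must be 'shipping' or 'billing', got {address_type!r}"
    )
```

| A search for `shipping_addresses` now lands on the function, and an
| unexpected runtime value raises instead of silently building a name
| for a table that does not exist.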
| Mikhail_Edoshin wrote:
| Conceptually it is akin to having file names that sort well.
|
| Grep is a simple tool, not too different from a simple string
| sort. It is better than no tool, but is it better than a tool
| that understands the notation? A strong side of grep is that it
| is universal and is not tied to a particular notation. Yet if you
| could easily define a specific notation and have a tool to
| immediately understand it, would you still prefer grep?
|
| We tend to organize the code according to the tools we have. E.g.
| if a tool gives us a list of entities in alphabetic order, we
| will try to name the entities so that they form "logical" groups.
| This may pass as a local organizational principle and may be
| useful but it is always intimately coupled with the underlying
| tool.
| eterevsky wrote:
| This also applies to dependency injection. While it has
| significant benefits, it hurts clarity of the code. It becomes
| more difficult to see where each object is coming from.
| mrkeen wrote:
| Magical dependency injection _frameworks_ , that is.
|
| Plain old putting-dependencies-in-the-constructor-instead-of-
| newing-them is great.
|
| If you 'wire' it yourself, you see the top-level structure of
| the project in main, e.g.
|
|     cache <- createCache "./cache"
|     workQueue <- createWorkQueue parallelism
|     projectFinder <- createProjectFinder basePath
|     gradleBuilder <- createGradleBuilder cache
|     normaliser <- createNormaliser
|     gradleParser <- createGradleParser normaliser
|     relationFinder <- createRelationFinder cache normaliser
|
| At a glance I can see what uses normaliser, and what is used by
| normaliser.
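|
| The same manual wiring could look roughly like this in Python. The
| class names echo the ones above, but their bodies are invented
| placeholders, so treat this as a sketch of the pattern only.

```python
class Normaliser:
    def normalise(self, text: str) -> str:
        return text.strip().lower()


class GradleParser:
    def __init__(self, normaliser: Normaliser):
        self.normaliser = normaliser  # passed in, not constructed here

    def parse(self, line: str) -> str:
        return self.normaliser.normalise(line)


class RelationFinder:
    def __init__(self, cache: dict, normaliser: Normaliser):
        self.cache = cache
        self.normaliser = normaliser

    def find(self, name: str) -> list:
        key = self.normaliser.normalise(name)
        return self.cache.setdefault(key, [])


def main():
    # The whole object graph is visible (and greppable) in one place.
    cache = {}
    normaliser = Normaliser()
    gradle_parser = GradleParser(normaliser)
    relation_finder = RelationFinder(cache, normaliser)
    return gradle_parser, relation_finder
```

| Searching for `Normaliser(` shows the single place it is created,
| and searching for `normaliser` in `main` shows everything that
| receives it.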
| kgeist wrote:
| We've ditched "magical dependency injection frameworks" for
| manual injections in constructors and the net result is that
| development is actually faster. Yes, you write slightly more
| code, but at the same time, it's much easier to figure out
| what's going on. Before, if something went wrong, we would
| get some arcane errors from deep inside the DI framework and
| had to decipher what it means and what changes we need to
| apply to the configs. Now, it's just regular code (great with
| refactorings and greppability, too).
| skrebbel wrote:
| The second point here made me realize that it'd be super useful
| for a grep tool to have a "super case insensitive" mode which
| expands a search for, say, "FooBar|first_name" to something like
| /foo[-_]?bar|first[-_]?name/i, so that any
| camel/snake/pascal/kebab/etc case will match. In fact, I struggle
| to come up with situations where that _wouldn't_ be a great
| default.
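|
| A minimal Python sketch of such a mode for a single identifier. The
| helper names `split_words` and `super_insensitive` are made up, and
| runs of capitals such as "HTTPServer" are not split, which a real
| tool would probably want to handle.

```python
import re


def split_words(identifier: str) -> list[str]:
    """Split camelCase, PascalCase, snake_case or kebab-case into lowercase words."""
    # Mark lower/digit -> upper boundaries, e.g. "FooBar" -> "Foo_Bar".
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", identifier)
    return [word.lower() for word in re.split(r"[-_\s]+", spaced) if word]


def super_insensitive(identifier: str) -> re.Pattern:
    """Build a regex matching the identifier in any common casing style."""
    words = split_words(identifier)
    # Allow an optional "-" or "_" between words and ignore letter case.
    return re.compile("[-_]?".join(map(re.escape, words)), re.IGNORECASE)


pattern = super_insensitive("first_name")
for candidate in ("firstName", "FirstName", "first-name", "FIRST_NAME"):
    print(candidate, bool(pattern.search(candidate)))
```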
| WizardClickBoy wrote:
| This reminds me of the substitution mode of Tim Pope's amazing vim
| plugin [abolish](https://github.com/tpope/vim-abolish?tab=readme-ov-file#subs...)
|
| Basically in vim to substitute text you'd usually do something
| with :substitute (or :s), like:
|
| :%s/textToSubstitute/replacementText/g
|
| ...and have to add a pattern for each differently-cased version
| of the text.
|
| With the :Subvert command (or :S) you can do all three at once,
| while maintaining the casing for each replacement. So this:
|
| textToSubstitute
|
| TextToSubstitute
|
| texttosubstitute
|
| :%S/textToSubstitute/replacementText/g
|
| ...results in:
|
| replacementText
|
| ReplacementText
|
| replacementtext
| WizardClickBoy wrote:
| Also just realised while looking at the docs it works for
| search as well as replacement, with:
|
| :S/textToFind
|
| matching all of textToFind TextToFind texttofind TEXTTOFIND
|
| But not TeXttOfFiND.
|
| Golly!
| gen220 wrote:
| In vim, I believe there's a setting that you can flip to
| make search case sensitive.
|
| In my setup, `/foo` will match `FoO` and so on, but `/Foo`
| will only match `Foo`
| User23 wrote:
| The Emacs replace command[1] defaults to preserving UPCASE,
| Capitalized, and lowercase too.
|
| [1] https://www.gnu.org/software/emacs/manual/html_node/emacs/Re...
| tambourine_man wrote:
| Of course it does. Or it wouldn't be Emacs
| adammarples wrote:
| Fzf?
| setopt wrote:
| Fuzzy search is not the same. For instance, it might by
| default match not only "FooBar" and "foo_bar" but also e.g.
| "FooQux(BarQuux)", which in a large code base might mean
| hundreds of false positives.
| mgkimsal wrote:
| Ideally there'd be some sort of ranking or scoring that
| would happen to sort by. FooQux(BarQuux) would seemingly
| rank much lower than FooBar when searching for FooBar or
| "Foo Bar" but might still be useful in results if ranked
| and displayed lower.
| setopt wrote:
| Indeed, that's a good solution - and I believe e.g. fzf
| does some sort of ranking by default. The devil is
| however in the details:
|
| One minor inconvenience is that the scoring should
| ideally be different per filetype. For instance, Python
| would count "foo-bar" as two symbols ("foo minus bar")
| whereas Lisp would count it as one symbol, and that
| should ideally result in different scores when searching
| for "foobar" in both. Similarly, foo(bar) should ideally
| have a lower score than "foo_bar" for symbol
| search even though the keywords are separated by the same
| number of characters.
|
| I think this can be accommodated by keeping a per-language
| list of symbols and associated "penalties", which can be
| used to calculate "how far" keywords are from each other
| in the search results weighted by language semantics :)
| hnben wrote:
| > "super case insensitive"
|
| Let's say someone would make a plugin for their favorite IDE for
| this kind of search. What would the details look like?
|
| To keep it simple, let's assume we just do the super-case-
| insensitivity, without the other regex condition. Let's say the
| user searches for "first_name" and wants to find "FirstName".
|
| one simple solution would be to have a convention where a word
| starts or ends, e.g. with " ". So the user would enter "first
| name" into the plugin's search field. The plugin turns it into
| "/first[-_]?name/i" and gives this regexp to the normal search
| of the IDE.
|
| another simple solution would be to ignore all word boundaries.
| So when the user enters "first name", the regexp would become
| "/f[-_]?i[-_]?r[-_]?s[-_]?t[-_]?n[-_]?a[-_]?m[-_]?e[-_]?/i".
| Then the search would not only be super-case-insensitive, but
| super-duper-case-insensitive. I guess the biggest downside
| would be, that this could get very slow.
|
| I think implementing a plugin like this would be trivial for
| most IDEs that support plugins.
|
| Am I missing something?
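|
| The first variant can be sketched in a few lines of Python. This is
| only an illustration: super_insensitive_pattern is a made-up name,
| not part of any IDE's plugin API, and a real plugin would pass the
| resulting pattern string to the editor's own search.

```python
import re


def super_insensitive_pattern(query):
    """Turn a space-separated query such as 'first name' into the
    equivalent of /first[-_]?name/i."""
    words = [re.escape(word) for word in query.split()]
    return re.compile("[-_]?".join(words), re.IGNORECASE)


pattern = super_insensitive_pattern("first name")
candidates = ["FirstName", "first_name", "first-name", "firstname", "first.name"]
matches = [c for c in candidates if pattern.search(c)]
```

| Here matches keeps the first four candidates and drops
| "first.name", because only "-" and "_" are accepted as separators.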
| inanutshellus wrote:
| IIUC, you're not missing anything though your interpretation
| is off from mine*. He wasn't saying it'd be hard, he was
| saying it should be done.
|
| * my understanding was simply that the regex would (A)
| recognize `[a-z][A-Z]` and inject optional _'s and -'s
| between... and (B) notice mid-word hyphens or underscores and
| switch them to search for both.
| __MatrixMan__ wrote:
| Shame on me for jumping past the simple solutions, but...
|
| If you're going that far, and you're in a context which
| probably has a parser for the underlying language ready at
| hand, you might as well just convert all tokens to a common
| format and do the same with the queries. So searches for foo-
| bar find strings like FooBar because they both normalize to
| foo_bar.
|
| Then you can index by more than just line number. For
| instance you might find "foo" and "bar" even when "foo = 6"
| shows up in a file called "bar.py" or when they show up on
| separate lines but still in the same function.
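|
| A minimal sketch of the normalize-then-index idea, in Python. It
| assumes a crude regex tokenizer instead of a real language parser
| (so "a-b" in Python source would wrongly count as one token), and
| the names TOKEN, normalize and build_index are hypothetical.

```python
import re
from collections import defaultdict

# Crude stand-in for a real tokenizer: letters, digits, "_" and "-".
TOKEN = re.compile(r"[A-Za-z][A-Za-z0-9_-]*")


def normalize(token):
    """Map FooBar, fooBar, foo-bar and foo_bar to the same key, foo_bar."""
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", "_", token)
    return re.sub(r"[-_]+", "_", spaced).lower()


def build_index(files):
    """Map each normalized token to the (file name, line number) pairs using it."""
    index = defaultdict(list)
    for name, text in files.items():
        for lineno, line in enumerate(text.splitlines(), start=1):
            for token in TOKEN.findall(line):
                index[normalize(token)].append((name, lineno))
    return index


files = {
    "models.py": "foo_bar = 6\nclass FooBar: pass\n",
    "style.css": ".foo-bar { color: red }\n",
}
index = build_index(files)
hits = index[normalize("foo-bar")]
```

| The query goes through the same normalize() call as the indexed
| tokens, so hits lists all three spellings across both files.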
| skrebbel wrote:
| Hm I'd go even simpler than that. Notably, I'd not do this:
|
| > So the user would enter "first name" into the plugin's
| search field.
|
| Why wouldn't the user just enter "first_name" or "firstName"
| or something like that? I'm thinking about situations like,
| you're looking at backend code that's snake_cased, but you
| also want it to catch frontend code that's camelCased. So
| when you search for "first_name" you automagically also match
| "firstName" (and "FirstName" and "first-name" and so on). I
| wouldn't personally introduce some convention that adds
| spaces into the mix, I'd simply convert anything that looks
| snake/kebab/pascal/camel-cased into a regex that matches all
| 4 forms.
|
| Could even be as stupid as converting "first_name" _or_
| "firstName", _or_ "FirstName" etc into
| "first_name|firstname|first-name", no character classes
| needed. That catches pretty much every naming convention
| right? (assuming it's searched for with case insensitivity)
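|
| As a sketch of that "stupid" conversion in Python (split_words and
| any_case_pattern are hypothetical helper names, and the word
| splitting is a simple heuristic that will mishandle acronyms such
| as "HTTPServer"):

```python
import re


def split_words(identifier):
    """Split a snake/kebab/camel/Pascal-cased identifier into lowercase words."""
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", identifier)
    return [w.lower() for w in re.split(r"[-_\s]+", spaced) if w]


def any_case_pattern(identifier):
    """Build first_name|firstname|first-name, to be matched case-insensitively."""
    words = [re.escape(w) for w in split_words(identifier)]
    alternatives = ["_".join(words), "".join(words), "-".join(words)]
    return re.compile("|".join(alternatives), re.IGNORECASE)


queries = ["first_name", "firstName", "FirstName", "first-name"]
patterns = {any_case_pattern(q).pattern for q in queries}
found = [bool(any_case_pattern("firstName").search(s))
         for s in ["user.first_name", "FIRST-NAME", "FirstName", "first name"]]
```

| All four query spellings collapse to the same pattern, so it does
| not matter which convention the user happens to type.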
| specialist wrote:
| > _"first_name" or "firstName"_
|
| Ya. Query tokenizer would emit "first" and "name" for both.
| That'd be neat.
| marcosdumay wrote:
| The best way would be to make an escape code that matches
| zero or one punctuation.
|
| So you'd search for "/first\\_name/i".
| Izkata wrote:
| That already exists as "?" and was used in their example:
| /first[-_]?name/i
|
| Or to use your example, just checking for underscores and
| not also dashes: /first_?name/i
|
| Backslash is already used to change special characters like
| "?" from these meanings into just "use this character
| without interpreting it" (or the reverse, in some
| dialects).
| kiitos wrote:
| It would be a mistake to try to solve this problem with
| regexes.
| dominicrose wrote:
| Let's say you have a FilterModal component and you're using it
| like this: x-filter-modal
|
| Improving the IDE to find one or the other by searching for one
| or the other is missing the point of the article, that
| consistency is important.
|
| I'd rather have a simple IDE and a good codebase than the
| opposite. In the example that I gave the worst thing is that
| it's the framework which forces you to use these two names for
| the same thing.
| skrebbel wrote:
| My point is that if grep tools were more powerful we wouldn't
| _need_ this very particular kind of consistency, which gives
| us the very big benefit of being allowed to keep every part
| of the codebase in its idiomatic naming convention.
|
| I didn't miss the point, I disagreed with the point because I
| think it's a tool problem, not a code problem. I agree with
| most other points in the article.
| boxed wrote:
| I think Nim has this?
| archargelod wrote:
| Nim comes bundled with a `nimgrep` tool [0], that is
| essentially grep on steroids. It has `-y` flag for style
| insensitive matching, so "fooBar", "foo_bar" and even
| "Foo__Ba_R" can be matched with a simple "foobar" pattern.
|
| The other killer feature of nimgrep is that instead of regex,
| you can use PEG grammar [1]
|
| [0] - https://nim-lang.github.io/Nim/nimgrep.html
|
| [1] - https://nim-lang.org/docs/pegs.html
| Groxx wrote:
| fwiw I pretty frequently use `first.?name` - the odds of it
| matching something like "FirstSname" are low enough that it's
| not an issue, and it finds all cases and all common separators
| in one shot.
|
| (`first\S?name` is usually better, by ignoring whitespace ->
| better ignores comments describing a thing, but `.` is easier
| to remember and type so I usually just do that)
| msmolkin wrote:
| Hey, I just created a new tool called Super Grep that does
| exactly what you described.
|
| I implemented a format-agnostic search that can match patterns
| across various naming conventions like camelCase, snake_case,
| PascalCase, kebab-case. If needed, I'll integrate in space-
| separated words.
|
| I've just published the tool to PyPI, so you can easily install
| it using pip (`pip install super-grep`), and then you just run
| it from the command line with `super-grep`. You can let me know
| if you think there's a smarter name for it.
|
| Source: https://www.github.com/msmolkin/super-grep
| dang wrote:
| You should post this as a Show HN! But maybe wait a while
| (like a couple weeks or something) for the current thread to
| get flushed out of the hivemind cache.
|
| If you do, email a link to hn@ycombinator.com and we'll put
| it in the second-chance pool
| (https://news.ycombinator.com/pool, explained at
| https://news.ycombinator.com/item?id=26998308), so it will
| get a random placement on HN's front page.
| msmolkin wrote:
| Wow, thanks so much for the encouragement and advice, dang!
| I'm honored to receive a personal response from you and so
| soon after posting. I really appreciate the suggestion to
| post this as a Show HN. If I end up doing it, I'll
| definitely wait a bit. Thanks for that suggestion, as I
| would have thought to do the opposite otherwise. Nice of
| you to offer to put it in the second-chance pool as well.
| skrebbel wrote:
| wow this is so cool!! it feels super amazing to dump a random
| idea on HN and then _somebody makes it_! i'm installing
| python as we speak just so i can use this.
| rldjbpin wrote:
| pretty cool and to me a better approach than the prescriptive
| advice from the OP. to me the crux of the argument is to make
| the code more readable from a popular tool. but if this can
| be well-integrated into common ide (or even grep perhaps), it
| would take away most of the argument down to personal
| preference.
| crazygringo wrote:
| Adding to that, I'm often bitten trying to search for user
| strings because they're split across lines to adhere to 80
| characters.
|
| So if I'm trying to locate the error message "because the disk
| is full" but it's in the code as:
|
|     ... + " because the " +
|     "disk is full")
|
| then it will fail.
|
| So really, combining both our use cases, what would be great is
| to simply search for a given _case-insensitive alphanumeric
| string_ in files that _skips all non-alphanumeric characters_.
|
| So if I search for:
|
|     Foobar2
|
| it would match all of:
|
|     FooBar2
|     foo_bar[2]
|     "Foo " + \
|       ("bar 2")
|     foo.bar.2
|
| And then in the search results, even if you get some accidental
| hits, you can be happy knowing that you didn't miss anything.
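|
| One way to approximate this with an ordinary regex engine, sketched
| in Python: allow any run of non-alphanumeric characters between the
| query's letters and digits. The function name
| skip_punctuation_pattern is invented for the example, and it only
| treats ASCII letters and digits as significant.

```python
import re


def skip_punctuation_pattern(query):
    """Match the query's letters and digits in order, ignoring case and
    any non-alphanumeric characters (including newlines) in between."""
    chars = [re.escape(c) for c in query if c.isalnum()]
    return re.compile(r"[^A-Za-z0-9]*".join(chars), re.IGNORECASE)


pattern = skip_punctuation_pattern("Foobar2")
samples = [
    "FooBar2",
    "foo_bar[2]",
    '"Foo " + \\\n    ("bar 2")',
    "foo.bar.2",
    "foo_baz_2",
]
results = [bool(pattern.search(s)) for s in samples]

# The split error message from the comment above is found as well.
source = '... + " because the " +\n    "disk is full")'
split_hit = bool(skip_punctuation_pattern("because the disk is full").search(source))
```

| The trade-off is the one already described: some accidental hits,
| but nothing is missed because of formatting or line breaks.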
| lathiat wrote:
| These are both of the problems I regularly have. The first
| one I immediately saw when reading the title of this
| submission was the "super case insensitive" that I often see
| when working on Go Codebases particularly when using a
| combination of Go Classes and YAML or JSON. Also happens with
| command line arguments being converted to variables.
|
| But the string split thing you mentioned happens a lot when
| searching for OpenStack error messages in Python that is
| often split across lines like you showed. My current solution
| is to randomly shift what I'm searching for, or try pick the
| most unique line.
| t43562 wrote:
| _astgrep_ is a very useful tool when grep fails:
| https://ast-grep.github.io/
|
| It's not as easy to use as grep but I think one can script it to
| be nearly so. It has huge power but without learning it all one
| can do searches that grep finds difficult. e.g. finding all the
| locations where a method is called and showing the parameters
| even if they are on multiple lines. You can then use rewrite
| rules to do CLI code refactoring.
|
| I think it also has potential in a build toolchain e.g. to look
| for patterns you want to discourage as a pre-commit hook.
|
| _ultragrep_ - https://github.com/zendesk/ultragrep - I don't
| love this quite as much but it does have a way to build indexes
| so you can do fast greps across a big codebase. It also has a
| text mode UI if you want it and I find that almost worthwhile.
|
| I use _ripgrep_ most of the time but while I like it, there is a
| limit to how many grep tools I can remember and I should probably
| cut down to using ultragrep and astgrep.
|
| plain gnu grep itself is something one has to know when one is on
| an unfamiliar machine.
| leetrout wrote:
| I encourage my teams to write logs / output with interpolation
| with the variables at the end for searchability
|
| For example:
|
|     Added %d users
|
| Vs:
|
|     Added users (%d)
|
| Then it is much easier to track down where things come from
| without needing wildcards in the search or to care too much about
| what might be dynamic in cases where it's not obvious.
| davemp wrote:
| I've basically landed on the following form:
|
|     'Short description. [foo={}, bar={}]'
|
| Which will give you grepability and in theory parsability so
| you can automatically bisect for a value change or something
| along those lines.
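|
| To illustrate the "parsability" half, here is a small Python sketch
| that splits such a line into its fixed message and its key/value
| pairs. The exact layout (", " between pairs, no "]" or ", " inside
| values) is an assumption, and LINE and parse are made-up names.

```python
import re

# Fixed, greppable message first; bracketed key=value pairs at the end.
LINE = re.compile(r"^(?P<message>.*?) \[(?P<fields>[^\]]*)\]$")


def parse(line):
    """Return (message, fields) for a matching log line, or None otherwise."""
    match = LINE.match(line)
    if match is None:
        return None
    pairs = [p for p in match.group("fields").split(", ") if p]
    fields = dict(pair.split("=", 1) for pair in pairs)
    return match.group("message"), fields


parsed = parse("Added users. [count=3, source=csv]")
unparsed = parse("Added 3 users from csv")
```

| The message part stays constant across runs, so it can be grepped
| for verbatim, while the fields can be compared between log lines.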
| leetrout wrote:
| Indeed. That is basically what the logging library from charm
| bracelet does.
| medstrom wrote:
| I like it, but it may be painful in languages where long
| variable names are common.
| jgrahamc wrote:
| In my very first job I wrote a spell checker/corrector for code
| comments. This was specifically to make greppability possible
| because some of my colleagues were appalling at spelling and it
| meant that the incredibly detailed comments we used to write were
| hard to search for key details.
| assanineass wrote:
| Sounds a little erotic...
| wrsh07 wrote:
| Nice article. Two notes:
|
| First, some of these suggestions will make it harder to introduce
| bugs when updating the code. That's good! Particularly tricky is
| when somebody splits up identifiers or function names. These
| types of things often occur at boundaries (calls between servers
| or to the db) which can make them tricky to test. Even if all
| your identifier combining is initially done in a single file,
| it's easy for someone to see the final shape of the identifier
| and accidentally hard code it somewhere else.
|
| Second, In the spirit of Titus Winters' "software engineering is
| programming over time", a codebase should be greppable over time.
|
| That means that if you rename a function, you might consider
| saving the old name of the function in a comment.
| hugodan wrote:
| Try to keep code reference indirections to a tasteful minimum. If
| a split is needed that's one more indirection for whoever will be
| maintaining it in the future. This weight needs to be on the
| table.
|
| Keeping things referentially transparent helps a lot here.
| eitland wrote:
| Working in spring, which accepts I don't know how many formats
| from ENV_VARS to yaml, this very much resonates with me, because
| as a general rule, if one can use a certain option, someone will
| do it.
|
| Also the reason why I try to avoid Gradle when possible:
|
| The possibilities are endless. At one place I think I found 21
| wildly different Gradle configs out of 24 that I checked.
|
| (For anyone that wonders, it was combinations of:
|
| - placeholders vs straightforward dependency (this is a thing in
| maven too)
|
| - for loops doing things based on lists or maps instead of just
| calmly declaring them one after another, maybe to save some
| characters
|
| - helper functions so you could declare dependencies like
| azure(<something>(<version>))
|
| - order of declarations
|
| - Kotlin vs Groovy syntax
|
| I have probably forgotten a couple more but this is thankfully
| already a few years ago.)
| r34 wrote:
| Good point. I would refer to another (similar) metric, which
| could be called "IDE-search-ability": it extends greppability, by
| adding some more conventions which work well with your (your
| company's) IDE.
| indymike wrote:
| Greppability and debuggability are two things that I look for in
| code reviews. If you ask, "How would you debug that?" and the
| answer starts with, "I'd rewrite it to..." Maybe, just maybe you
| should write it that way.
| 0x69420 wrote:
| sure. _the_ reason i put a line break between return type and
| function name in c-likes is `grep ^fname`. but i seriously wish
| greppability _wasn't_ important. the extensive line-orientedness
| of unix tools really puts a damper on the whole hose-of-bytes
| concept, and it's no wonder by the time of plan 9, there was a
| strong desire to do away with it--cf. "structural regular
| expressions", as deployed in sam(1), which, of all the places to
| put them, certainly has historical irony, as sam's (decidedly
| _not_ line-oriented) editing language nonetheless descends from
| ed, the definitive _line editor_, and gave us such hits as
| "stream ed" and "simulate typing `g/regex/p` into ed".
|
| just the other week i noticed a change in recommended formatting
| style in a project i contribute to regularly, and the result was
| source files got about 20% taller, 20% more of a pain in the ass
| to edit without some sort of syntax folding. the rationale? diff.
| making you reach for a syntax-aware editor to compensate for a
| deficiency in the syntax-awareness of a version control frontend
| is certainly a choice.
|
| the business end of git as seen by most programmers is in fact
| diff city, sure, but deep down git is a bunch of snapshots. even
| deltas behave nothing like diffs. pull up the spec for the pack
| format and look for the word "line". you will not find it.
|
| things could be so much better, but for now we live in a world
| where the headline is true.
| nickjj wrote:
| Absolutely.
|
| It's what I like about Rails when it comes to file names too.
| Having _controllers/users_controller.rb_ as a path might sound
| wasteful because "you're already in the controllers directory,
| you don't need _controllers in the path".
|
| But when you want to fuzzy find that file, it's really nice to
| type "users con" and get that file instead of also picking up
| views, models and other user related files with just a "users"
| search.
| trilbyglens wrote:
| Imo this is another big selling point of using Tailwind CSS.
| Those long stacks of classes become almost like UUIDs for markup
| that's discoverable from dev tools.
| ceritium wrote:
| I built the command line tool flatito just for the Rails i18n
| translations keys.
|
| I am unsure if I like the author's approach because there are
| other cons, but it's a good point.
|
| * https://github.com/ceritium/flatito
| klysm wrote:
| A good IDE makes up for this if the syntax of the language
| doesn't lend itself to easy greppage. I lean heavily on JetBrains
| search with their editors
| frabjoused wrote:
| This is why I've always fought against BEM in CSS. Tends to drive
| greppability to zero.
| dwh452 wrote:
| This sounds like the advice to prefer the variable name 'ii' over
| 'i' because you can easily search for it. I loathe such advice
| because it causes the code to become ugly. Similarly, there are
| 'YODA Conditions' which make code hard to comprehend which solves
| an insignificant error that is easily caught with tooling. The
| problem with advice like these is you will encounter deranged
| developers that become obsessive about such things and make the
| code base ugly trying to implement dozens of style rules. Code
| should look good. Making a piece of text look good for other
| humans to comprehend I consider to be job #1 or #2 for a good
| developer.
| moolcool wrote:
| > The problem with advice like these is you will encounter
| deranged developers that become obsessive about such things and
| make the code base ugly trying to implement dozens of style
| rules
|
| That's more of a "deranged developer" problem than a problem
| with the guidelines themselves. E.g. I think his `getTableName`
| example is quite sensible, but also one which some dogmatic
| engineers would flag and code-golf down to the one-liner.
| ajuc wrote:
| > 'ii' over 'i'
|
| You don't need to search for local variables, nobody names
| global variables "i" - so the "ii" advice is pointless.
|
| You often do need to search for places where global stuff is
| referenced, and while IDEs can help with that - the same things
| that break grepability often break "find references" in IDE.
| For example if you dynamically construct function names to
| call, play with reflections, preprocessor, macros, etc.
|
| So it's good advice to avoid these things.
|
| > you will encounter deranged developers that become obsessive
| about such things and make the code base ugly
|
| You can abuse any rule, including
|
| > Code should look good.
|
| and I'd argue the more general a rule is - the more likely it
| is to be abused. So I prefer specific rules like "don't
| construct identifiers dynamically" to general "be good" rules.
| inetknght wrote:
| > _This sounds like the advice to prefer the variable name 'ii'
| over 'i' because you can easily search for it_
|
| I've never heard of that advice. I honestly like algebraic
| names (singular digits) as long as they're well documented in a
| comment or aliasing another longer-name.
|
| > _there are 'YODA Conditions' which make code hard to
| comprehend which solves an insignificant error that is easily
| caught with tooling_
|
| Yoda conditions [0] are a useful defensive programming
| technique and do not reduce readability except to someone new
| to it. I argue it improves readability, particularly for
| myself.
|
| As for tooling... it doesn't catch every case for every
| language.
|
| > _I loathe such advice because it causes the code to become
| ugly._
|
| Beauty is in the eye of the beholder. While I appreciate your
| opinion, I also reject it out of hand for professional
| developers. Instead of deciding whether code is "ugly" perhaps
| you should decide whether the code is useful. Feel free to keep
| your pretty code in your personal projects (and show them off
| so you can highlight how your style really comes together for
| that one really cool thing you're doing).
|
| > _you will encounter deranged developers that become obsessive
| about such things_
|
| I don't like being called deranged but I am definitely obsessed
| about eliminating whole classes of bugs just by the coding
| design and style not allowing them to happen. If safe code is
| "ugly" to you... well then I consider myself to be a better
| developer than you. I'd rather have ugly code that's easily
| testable instead of pretty code that's difficult to test in
| isolation which most developers end up writing.
|
| > _Code should look good. Making a piece of text look good for
| other humans to comprehend I consider to be job #1 or #2 for a
| good developer._
|
| It depends on the project. Just remember that what looks good
| to you isn't what looks good to me. So if it's your personal
| project, then make it look good! If it's something we're both
| working on... then expect to defend your stylistic choices with
| numbers and logic instead of arguments about "pretty".
|
| Then, from the article:
|
| > _Flat is better than nested_
|
| If I'm searching for something in JSON I'm going to use jq [1]
| instead of grep. Use the right tools for the right job after
| all. I definitely prefer much richer structured data instead of
| a flat list of key-value pairs.
|
| [0] https://en.wikipedia.org/wiki/Yoda_conditions
|
| [1] https://en.wikipedia.org/wiki/Jq_(programming_language)
| antifa wrote:
| > the advice to prefer the variable name 'ii' over 'i' because
| you can easily search for it
|
| \bi\b is the easy way to search for i.
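|
| A quick Python check of what the word-boundary pattern does and
| does not match (the sample line is invented for the example):

```python
import re

line = "for i in items: print(i, ii, idx)"

# \b asserts a word boundary, so only the standalone identifier i matches.
standalone = [m.start() for m in re.finditer(r"\bi\b", line)]

# Without boundaries, every letter i inside other words matches too.
unanchored = len(re.findall(r"i", line))
```

| With grep itself, the -w flag gives the same whole-word behaviour.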
| marcosdumay wrote:
| Those things only make the codebase "ugly" until you learn how
| to read it.
| Aachen wrote:
| > prefer the variable name 'ii' over 'i'
|
| Vim users don't have this issue, where you can * a variable
| name you're looking for and that'll enforce word boundaries.
|
| PHP developers also don't have this issue: the $ before is a
| pretty sure indicator you're working with a variable named i
| and not a random i in a word or comment somewhere
|
| Y'all should just use proper tools. No newfangled Rust and
| Netbeans that the kids love these days, the 80s had it all!
|
| (Note this is all in jest :) I do use vim and php but it's
| obviously not a reason to use a certain language or editor; I
| just wondered to myself why I don't have this problem and
| realised there's two reasons.)
| poikroequ wrote:
| Grep is nice, but I would much prefer better tools for searching
| through code. Something that knows how to parse multiple
| languages and can infer the types of things. Not to mention
| indexing, for large code bases, grep'ing through possibly
| millions of lines of code can be awfully slow.
|
| IDEs do a decent job but are typically lacking compared to the
| raw power of grep.
| packetlost wrote:
| I mean, I prefer faster symbol, type, etc. based navigation
| too, but it doesn't work in all scenarios so grep is an
| extremely handy fallback.
| Timwi wrote:
| All of the apostrophes in this article are wrong. The correct
| character is ’, but this article uses ‘ (open single quote)
| throughout.
| riz_ wrote:
| Thanks, fixed.
| svennidal wrote:
| This approach along with ack, instead of grep, has been a godsend
| to me.
| pooriar wrote:
| I just shared this in the work Slack and everyone resoundingly
| agreed with the sentiment. Definitely going to pay more attention
| to this now, thanks for sharing!
| trey-jones wrote:
| As someone who almost exclusively uses grep for finding what I
| need in codebases that are new to me and old to me, you can make
| whatever arbitrary rules you want, as long as you're consistent,
| I'll be pretty happy with it. If syntax is loose in some area
| (single vs double quotes, parens or braces or none), just do the
| same thing every time. Whitespace consistency isn't crucial, but
| it can't hurt (between function name and parens, for example).
| necrotic_comp wrote:
| Agreed. So long as the code hits performance and business
| goals, there doesn't need to be an emphasis put on "newness" or
| any other sort of vanity metric - make the code obvious,
| searchable, and understandable so that in a time crunch or
| during an outage it's easy to search and find the culprit.
| causal wrote:
| I'm also thinking long-context LLMs are going to make this
| advice seem pretty archaic in a few years. They're so good at
| reading code and extremely useful for asking questions of a
| code base.
|
| That said, I completely agree with the author on not using
| clever string tricks to compose identifiers. That makes code
| both harder to search and to read.
| mannycalavera42 wrote:
| if only there was a language where code is data so that... hold
| on a sec! #LISP-languages
| guhcampos wrote:
| One situation that comes to mind is configuration of applications
| on containers using environment variables.
|
| It's extremely valuable to be able to just `grep -r PREFIX_` on a
| codebase and be able to visualize all possible configuration
| values for that application.
|
| This is encouraged by some frameworks like Django, where you are
| expected to list all the configuration values in a `settings`
| module, but is not standard for `viper`, `click` and `pydantic-
| settings`, which try to be too smart and auto-generate the
| variable names for you. It's one of these cases where "modern"
| frameworks and applications try to save a minuscule amount of
| work by automating some task, but end up reducing the
| maintainability of the code over time.
| Shorel wrote:
| I use grep and git grep all the time.
|
| This post is very welcome, it sums up my own ideas about grep in
| a better way.
| jayd16 wrote:
| I'm a big proponent of visual scripting (where it makes sense)
| but you really do miss the text-based tooling like grep.
|
| One trade-off you can make is using text-based serialization so
| you're at least able to grep the yaml or JSON or whatever and get
| to the right file at least. This of course costs you some editor
| load time.
|
| On the flip side you're basically always using an IDE to edit the
| visual script. In theory semantic search should be possible and
| built in although reality usually falls short.
|
| Someone in a previous HN thread mentioned the idea of a standard
| graph syntax. Something that game engines and tools could store
| their graph-based assets in. If there was a standard syntax then
| standard tools could be made and we could end up seeing something
| like a graph grep. One could imagine a visual studio graph
| editor type app with plug-in support. Even a standard merge tool
| would be a huge step up for non-text-based code assets.
|
| A man can dream!
| wizzwizz4 wrote:
| There _is_ a standard graph syntax: Graphviz DOT notation.
| jayd16 wrote:
| I'm not familiar with this but it seems more geared towards
| visually representing flow charts but it doesn't have the
| necessary verbosity for visual scripting. I might be wrong
| but I think this is only part of the puzzle. Can rules be
| defined such that, for example, a node port must be pointed
| to, or that a default value is available?
|
| Besides the declaration of a graph, there also needs to be a
| way to define semantics, ie something akin to a Language
| Server Protocol. That way tooling can enforce validity while
| editing a graph.
| wizzwizz4 wrote:
| DOT is not a schema language, no.
|
| It sounds like you're trying to make something new, in
| which case: I'd forego standardisation concerns for now,
| and focus on getting something that works, and where future
| prototyping can be backwards-compatible.
| nottorp wrote:
| Funny, I started work on a legacy code base a couple months ago
| and yes, it has all the problems described in the article and
| that hinders our understanding of it.
| j45 wrote:
| Code that is not written for others in the future may have a
| limited future
| settsu wrote:
| I'm ardently in favor of making code human readable as
| practically as possible. Personally it follows on my personal
| Rule #1: Be kind (i.e., in this case, to others and your future
| self.)
|
| However, searchable =/= greppable.
|
| > Flat is better than nested
|
| Context matters but, generally speaking, I would say that flatter
| is _anti_-grep.
| breck wrote:
| There's a new kind of language where the practice is to use
| whitespace, and only whitespace, as your syntax. Newlines
| separate blocks and spaces separate words.
|
| One of the unexpected extremely powerful things this allows is
| finding function usage extremely easy in any text editor that
| supports regex. You just search for ^[functionName] . Since you
| know that function pretty much will only be used at the beginning
| of lines. You can thus make edits against the AST with regexes
| and without parsing the AST at all.
|
| It's pretty amazing, and leads to quite faster development, and
| allows one to tackle bigger and more complex problems.
| welder wrote:
| Also import traceability
| hilux wrote:
| Digital marketers have known this for a long time.
| HeavyStorm wrote:
| Only read the title but if it's what I guess it is, then I
| finally met someone who'll understand that I always declare
| functions in js using the keyword function.
| tlb wrote:
| An editor feature I'd like is that as I'm typing an identifier, a
| hover popup shows me how many other instances appear in my
| codebase. It should be easy to build a map of identifier->count
| for instant lookups. I generally know if I want 0 (a new unique
| identifier), 1, a small number, or a large number. For a few
| pixels, this would prevent a lot of dumb mistakes and ambiguous
| names.
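|
| The identifier->count map itself is cheap to build. A minimal
| Python sketch, assuming a plain regex is good enough to spot
| identifiers (it will also count words in comments and strings) and
| using the made-up names IDENTIFIER and count_identifiers:

```python
import re
from collections import Counter

IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def count_identifiers(sources):
    """Count how often each identifier-like word appears across all sources."""
    counts = Counter()
    for text in sources:
        counts.update(IDENTIFIER.findall(text))
    return counts


sources = ["user_id = load(user_id)\n", "print(user_id, usr_id)\n"]
counts = count_identifiers(sources)
```

| A count of 1 for usr_id next to 3 for user_id is exactly the kind
| of hint the hover popup would surface; a Counter returns 0 for
| names it has never seen, which covers the "new unique identifier"
| case.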
| jgalt212 wrote:
| Not to pick on Black, as it happens with all code formatters, but
| max line length rules kill greppability.
|
| > Black defaults to 88 characters per line
|
| https://black.readthedocs.io/en/stable/the_black_code_style/...
| linuxdude314 wrote:
| The examples are pretty silly, especially the first one.
|
| If you know that you need either a shipping or billing address
| and the user has specified which one they need, just query based
| on that.
|
| There's no need to introduce a function (getTableName) to
| detemplate a string or match on a case.
|
| Instead just create a function that gets the item you want from
| the DB and has the table name as input.
|
| On your UI make sure when the user specifies billing or shipping
| address the correct parameter is passed to the API.
| KronisLV wrote:
| I work on a project where people decided to refer to translations
| by doing the equivalent of:
|
|     :label="$translate(getProductSectionLabel('title'))"
|
| where the logic is a bit like:
|
|     const getProductSectionLabel = (code) =>
|       `myapp.sales.sections.products.${code}`
|
| and then the actual values are in a nested structure, like:
|
|     myapp: {
|       sales: {
|         sections: {
|           products: {
|             title: "Products"
|             ...
|           }
|           ...
|         }
|         ...
|       }
|       ...
|     }
| People seem to have gone for that because writing that first part
| is simpler within the component, but I couldn't get across that
| this makes the codebase harder to navigate.
|
| Meanwhile, my personal codebases are more like:
|
|     :label="$translate('myapp-sales-products-title')"
|
| and the translation file also has the equivalent of:
|
|     myapp-sales-products-title: "Products"
|
| which is way simpler at the expense of some more duplication
| (easily mitigated by compressing the translations).
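|
| Moving an existing nested translation file to flat, greppable keys
| can be scripted. A small Python sketch, where flatten is a
| hypothetical helper and "-" as the separator is just one possible
| choice (the key it derives keeps every nesting level, so it
| includes "sections"):

```python
def flatten(tree, prefix="", sep="-"):
    """Collapse nested dictionaries into a single level of joined keys."""
    flat = {}
    for key, value in tree.items():
        path = f"{prefix}{sep}{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path, sep))
        else:
            flat[path] = value
    return flat


nested = {"myapp": {"sales": {"sections": {"products": {"title": "Products"}}}}}
flat = flatten(nested)
```

| After that, the full key appears verbatim in both the translation
| file and the component, so a plain text search finds both.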
| advael wrote:
| For my purposes, among the most important
|
| Even a major refactor is relatively easy if you can find stuff in
| your codebase. Even a small bugfix can get complicated if there's
| a ton of ambiguity
| Groxx wrote:
| Along similar lines: I _highly_ recommend making _every_ metric
| and log in your system spelled out completely somewhere.
|
| Don't:
|
|     base = "abc"
|     something = base + ".some.suffix"
|
| Do:
|
|     something = "abc.some.suffix"
|
| I've also had some luck with hard-coded UUIDs at call sites,
| e.g.:
|
|     log.Info("something", "callsite", "DECAFBAD-000...")
|
| because it makes it _absolutely trivial_ to find a log, and
| unlike caller-lines (which are great! use them too!) it doesn't
| change when you refactor code.
| germandiago wrote:
| This is one of the top reasons why I prefer explicit over
| implicit.
| dang wrote:
| Yes, and 'greppable' is an underused word/concept in its own
| right.
|
| I've used this as an organizing principle since forever
| (https://news.ycombinator.com/item?id=1535916). It's one of the
| best ways to factor code that I know of.
| tptacek wrote:
| It's good, but I'm snagging on the examples here; for instance,
| the insistence on consistent use of camel or snake case. That
| seems to me to be _purely_ a tooling limitation; a more code-
| aware grep could just automatically synthesize the covering
| regexp for both. It feels like there's a sweet spot in between
| LSP-ism and grep-ism that isn't being captured.
| tabbott wrote:
| Greppability is the reason why I feel it's really quite
| unfortunate that HTML's dataset attribute went with
| "canonicalizing" to camel case; if you're using any non-camel-
| case data attribute names in your project, then they immediately
| become not fully greppable when making use of `dataset`.
|
| https://developer.mozilla.org/en-US/docs/Web/API/HTMLElement...
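|
| To make the mismatch concrete, here is a small Python sketch of
| the name conversion that `dataset` applies: drop the "data-"
| prefix, then turn each hyphen followed by a lowercase ASCII
| letter into that letter in upper case. The function name is
| made up for illustration.

```python
import re


def dataset_key(attribute_name: str) -> str:
    """Map an attribute name like 'data-product-id' to its dataset key."""
    prefix = "data-"
    if not attribute_name.startswith(prefix):
        raise ValueError("not a data-* attribute: " + attribute_name)
    name = attribute_name[len(prefix):]
    # A hyphen followed by a-z is removed and the letter is upper-cased.
    return re.sub(r"-([a-z])", lambda match: match.group(1).upper(), name)
```

| So the markup says data-product-id while the script says
| productId, and a plain text search for one misses the other.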
| peter_d_sherman wrote:
| Observation:
|
| The more broad understanding here is that in computer programming
| there is a difference between a _symbol_ and a _referent_
| (https://open.maricopa.edu/com110/chapter/3-1-language-and-
| me....) AKA _variable_ and its _value_.
|
| (Phrased colloquially: "The finger pointing at the Moon is not
| the Moon.")
|
| Information about a given computer program may be found in its
| symbols (variable names) _and_ its referents (values that
| variables are set to).
|
| The referents may be set in source code (i.e. a string or other
| constant) and be visible/greppable at design-time -- and/or they
| may be set at runtime (and be non-greppable!)
|
| The article is good; I have nothing against it, it makes an
| excellent point.
|
| But if we look at what I've outlined above as three possible
| places for information (source symbol, source referent, runtime
| referent), then we observe that one of these possibilities is not
| covered by grep -- that is, runtime referent.
|
| This leads to the question:
|
| "What if grep could be used to search for runtime referents, aka
| runtime values of different variables?"
|
| Well, in traditional compiled languages, we can't do that without
| significant monkeying with the language...
|
| In interpreted and Lisp-like languages, yes, the above is
| possible -- but without being able to be very specific about the
| what/how/when of the runtime strings (or more broadly, values) to
| be grepped, doing the above could generate huge amounts of data
| and speed degradation, particularly if a value is set and reset
| multiple times in a long loop!
|
| But could it be done? Definitely! Efficiently? Well, that's one
| of the questions! Deterministically? If random values are in
| play, probably not. With full code coverage, so that values from
| code paths not taken can also be known? That scenario would
| be challenging to say the least.
|
| Point is, it may be an interesting future language feature to
| consider in various scenarios, for future language designers...
|
| "The ability to grep through all runtime referents, aka all
| runtime values, of all variables, for a given runtime of a
| program..."
|
| Hmmm... now that I think about it, _what if_ a language was
| created such that you could pass it a special flag, and with that
| flag set, variables would maintain their histories in memory (at
| great expense to performance -- this would be for debugging
| only!), and then you could grep that memory?
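|
| As a toy illustration of that debug-only mode, the following
| Python sketch uses sys.settrace to record the repr of every
| value each local variable takes, then searches the recorded
| histories with a regex. All function names here are
| hypothetical, and this is far too slow for anything but
| experiments.

```python
import re
import sys
from collections import defaultdict


def run_with_history(func, *args):
    """Run func under a tracer, recording each distinct value every local takes."""
    history = defaultdict(list)  # (function name, variable name) -> list of reprs

    def record(frame):
        for name, value in frame.f_locals.items():
            key = (frame.f_code.co_name, name)
            text = repr(value)
            if not history[key] or history[key][-1] != text:
                history[key].append(text)

    def local_trace(frame, event, arg):
        # "line" fires before each line runs; "return" captures the final state.
        if event in ("line", "return"):
            record(frame)
        return local_trace

    def global_trace(frame, event, arg):
        return local_trace if event == "call" else None

    previous = sys.gettrace()
    sys.settrace(global_trace)
    try:
        result = func(*args)
    finally:
        sys.settrace(previous)
    return result, dict(history)


def grep_history(history, pattern):
    """Return the (function, variable) pairs whose history matches pattern."""
    regex = re.compile(pattern)
    return sorted(key for key, values in history.items()
                  if any(regex.search(value) for value in values))


def demo(n):
    label = "start"
    for i in range(n):
        label = f"step-{i}"
    label = "done"
    return label
```

| After run_with_history(demo, 3), grepping the history for
| "step-1" points at the variable label in demo, even though that
| value was overwritten long before the function returned.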
|
| Yes, I know... given the pace of technology, there probably is
| some system or systems that might implement some aspect or
| aspects of this...
|
| rr comes to mind:
|
| https://rr-project.org/
|
| https://news.ycombinator.com/item?id=31617600
|
| (Maybe the rr maintainers could be persuaded to implement some
| interface with grep, either pre or post program run, or maybe
| implement a timer where the user can grep their program for
| various strings at various time intervals...)
|
| Anyway, good article!
| kazinator wrote:
| In C, the word referent is used for the thing a pointer points
| to. That's your main ungreppable run-time referent.
|
| > "The ability to grep through all runtime referents, aka all
| runtime values, of all variables, for a given runtime of a
| program..."
|
| It sounds entirely doable in a garbage-collected run-time. We
| "stop the world" and then run something similar to the marking
| pass of a full garbage collection cycle, which looks for
| matches for a pattern in the entire space of reachable objects.
| The routine could maintain a path from root pointer to object
| (easily done on the recursion stack) so that it's reported
| along with matches.
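|
| A minimal Python sketch of that marking-style pass, under heavy
| assumptions: it walks only dicts, lists, tuples, and objects
| with a __dict__, it uses a set of ids in place of real GC mark
| bits, and there is no actual "stop the world". The names
| children and grep_heap are made up.

```python
import re


def children(obj):
    """Yield (edge label, child) pairs for containers and plain objects."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            yield f"[{key!r}]", value
    elif isinstance(obj, (list, tuple)):
        for index, value in enumerate(obj):
            yield f"[{index}]", value
    elif hasattr(obj, "__dict__"):
        for name, value in vars(obj).items():
            yield f".{name}", value


def grep_heap(roots, pattern):
    """Walk everything reachable from roots; return (path, value) for matching strings."""
    regex = re.compile(pattern)
    marked = set()  # ids of visited objects, standing in for GC mark bits
    hits = []

    def visit(obj, path):
        if id(obj) in marked:
            return  # already traversed: never descend twice
        marked.add(id(obj))
        if isinstance(obj, str):
            if regex.search(obj):
                hits.append((path, obj))
            return
        for label, child in children(obj):
            # The path from the root rides along on the recursion stack.
            visit(child, path + label)

    for name, root in roots.items():
        visit(root, name)
    return hits
```

| Each match is reported once, with the first path by which the
| traversal reached it.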
| peter_d_sherman wrote:
| Sounds good!
|
| >"The routine could maintain a path from root pointer to
| object (easily done on the recursion stack) so that it's
| reported along with matches."
|
| Yes, it definitely could!
|
| For extra points (if the additional functionality is
| desired!), also expose this functionality through an API for
| coding AIs, present and future, such that
| they can also perform this form of grepping automatically
| when needed, for example, when running/debugging a program
| they wrote, or are assisting a programmer with making changes
| to...
|
| (Observation: There seem to emerge two different main
| patterns with respect to variable inspection... One is to set
| a watchpoint on a variable and stop the program when the
| variable changes.
|
| Another would be to globally scan all variables (i.e. grep)
| at specific intervals (a form of a batch pattern) for
| specific strings/values, and all of that would be/could be --
| settable by the programmer and/or an AI...)
|
| Anyway, your idea is a good one!
| kazinator wrote:
| Objects in heaps can be reached in more than one way. We
| don't want to traverse any object twice, because we will
| loop, so we must use the GC marked bit.
|
| Yet! There is a way to report all the paths by which a
| grepped-out object is reachable.
|
| Say we encounter an interesting object O for the first
| time, at path A B C. So we have A.B.C.O.
|
| We keep objects A, B and C in a hash table. The hash table
| represents nodes we have visited, which are on the path to
| a hit.
|
| Then say we encounter object B again, at path X Y Z B. B is
| a hit in our hash table, and so we report X.Y.Z.B.C.O as
| another hit for object O, without descending into B again.
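|
| One way to sketch this in Python is a variation on the scheme
| above: instead of keeping only the nodes on a path to a hit,
| the table below remembers, for every finished object, the
| relative paths from it to each hit beneath it. Re-encountering
| the object then reuses that list without descending again.
| Names are hypothetical, and in graphs with cycles the results
| can be incomplete, since paths that loop back through an
| ancestor are skipped.

```python
import re


def children(obj):
    """Yield (edge label, child) pairs for containers and plain objects."""
    if isinstance(obj, dict):
        for key, value in obj.items():
            yield f"[{key!r}]", value
    elif isinstance(obj, (list, tuple)):
        for index, value in enumerate(obj):
            yield f"[{index}]", value
    elif hasattr(obj, "__dict__"):
        for name, value in vars(obj).items():
            yield f".{name}", value


def grep_all_paths(roots, pattern):
    """Return every root-to-hit path, traversing each object only once."""
    regex = re.compile(pattern)
    suffixes = {}        # id(obj) -> relative paths from obj to hits below it
    in_progress = set()  # ids of objects currently on the recursion stack

    def visit(obj):
        if isinstance(obj, str):
            return [""] if regex.search(obj) else []
        key = id(obj)
        if key in suffixes:
            return suffixes[key]  # seen before: reuse, do not descend again
        if key in in_progress:
            return []             # cycle back to an ancestor: stop here
        in_progress.add(key)
        found = []
        for label, child in children(obj):
            found.extend(label + tail for tail in visit(child))
        in_progress.discard(key)
        suffixes[key] = found
        return found

    return [name + tail for name, root in roots.items() for tail in visit(root)]
```

| With B shared between A and X.Y.Z, the second visit to B
| returns its stored suffix, which yields the second full path to
| O without walking B's children again.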
| mkoubaa wrote:
| This is why I try to avoid code reflection in cases where there's
| an alternative, even when that alternative is less elegant
| jrd259 wrote:
| The ability to search the code base is one reason I insist on
| correct spelling _even in comments_ when doing code reviews, and
| also keep adding to my IDE's dictionary so it will catch _my_
| spelling errors.
| codedokode wrote:
| CSS preprocessors are often used in a way that hurts
| "greppability". For example, SCSS developers often break
| identifiers like
|     .product {
|       &-promo {
|
| I always felt it was a bad practice but nevertheless it is often
| used. Glad to find out that I am not the only one who noticed
| this.
|
| Of course, one might say that I should have used a specialized
| tool for this. But it would be more convenient if I could search
| anything with text search. Also, there is no tool that lets you
| search over identifiers that are computed at runtime in a
| program.
| krawczstef wrote:
| Yep. When I was designing https://github.com/dagworks-inc/hamilton
| part of the idea was to make it easy to understand
| what and where. That is, enable one to grep for function
| definitions and their downstream use easily, and where people
| can't screw this up. You'd be surprised how easy it is to make a
| code base where grep doesn't help you all that much (at least in
| the python data transform world) ...
| eschluntz wrote:
| Totally agree! I wrote this a few years ago:
| https://blog.cobaltrobotics.com/you-cant-read-code-that-you-...
| maaaaattttt wrote:
| From a true e-commerce story. If you have to somehow deal with
| product conditions, don't use the condition "new" for new vs.
| mint, used, damaged products. This has a nightmarish
| greppability... I'll never make that mistake again.
| kmoser wrote:
| The example for "Don't split up identifiers" is impractical when
| you need to construct a string on the fly. Author's example might
| work for the simple case of two potential values, but as soon as
| addressType can take on more than a trivial number of values, the
| code becomes very non-DRY.
|
| Also, the example completely glosses over the egregious use of
| magic constants. Wouldn't it be better to declare addressType an
| enum?
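|
| A sketch of how an enum and full, greppable strings can
| coexist, in Python. It assumes addressType has the two values
| "shipping" and "billing"; the table names and the helper
| address_table are hypothetical.

```python
from enum import Enum


class AddressType(Enum):
    SHIPPING = "shipping"
    BILLING = "billing"


# Each table name is written out in full, so a text search for
# "billing_addresses" finds this mapping directly.
ADDRESS_TABLES = {
    AddressType.SHIPPING: "shipping_addresses",
    AddressType.BILLING: "billing_addresses",
}


def address_table(address_type: AddressType) -> str:
    """Look up the table name instead of building it by concatenation."""
    return ADDRESS_TABLES[address_type]
```

| The cost is one line per value, which is the non-DRY part, but
| nothing is assembled from fragments at runtime.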
|
| The example for "Use the same names for things across the stack"
| only works if you control the whole stack. Sometimes you have to
| fetch data from a 3rd party. And even if you do own the whole
| stack, sometimes it's nice to be able to search for 'streetName'
| when you're looking for the JS symbol, and 'street_name' when
| you're looking for the DB column. So I don't think this rule is
| always beneficial.
|
| At the end of the day, if you really want to make your code
| greppable, add some verbose comments that include useful keywords.
| (I've found it useful to include keywords that aren't even in the
| code but which devs may search for.) This lets you grep
| regardless of how the code is written or formatted.
___________________________________________________________________
(page generated 2024-09-04 23:01 UTC)