[HN Gopher] Excel as Code
___________________________________________________________________
Excel as Code
Author : LukeEF
Score : 107 points
Date : 2021-09-20 15:51 UTC (7 hours ago)
(HTM) web link (github.com)
(TXT) w3m dump (github.com)
| mongol wrote:
| A killer app would be a spreadsheet format that worked as well as
| source code as storage format for a spreadsheet application.
| Something that was designed for manual editing in two different
| ways, in text editors and in a "cell editor". That would support
| all version control use cases that developers are familiar with
| and that have been best practice for decades.
|
| Perhaps all that is needed is to port OpenOffice to the sc format
| (and extend it in the spirit it works now)
| sneak wrote:
| This might be able to be achieved with a new serialization
| format for xls files. Something line-based, with canonicalized
| sorting of cells.
| [deleted]
| breck wrote:
| At my last job at OurWorldInData we made something like this.
| One of the head researchers would build sophisticated
| spreadsheets containing all the transformations and views our
| users could do, and rather than re-implement that logic in
| Typescript, we saved it as TSV and built a spreadsheet editor
| for the researchers to use. From a code perspective it was just
| a tree to traverse.
|
| Demo: https://www.youtube.com/watch?v=0l2QWH-iV3k
|
| Changes in the spreadsheet UI then work really well with git.
| For example: https://github.com/owid/owid-
| content/commit/37ef12d65655fa14...
| badhombres wrote:
| I truly believe that Excel is the most abused software of all
| time. It has been mangled, malformed, smashed, and manipulated to
| do stuff that I don't believe the creators ever intended. Adding
| a scripting capability to it has unlocked the spirit of challenge
| in all SME's of finance related fields to make Excel the sole
| software they will use for all problems.
| ivyirwin wrote:
| I get the spirit of the document, but disagree with the goal. I'm
| biased, I've kind of made my career writing web applications for
| people reliant on Excel. While I've come to respect it's power -
| I had a colleague in architecture school design buildings using
| excel and I've seen some ridiculous formulas based on crazy pivot
| tables and conditionals.
|
| I've seen more spreadsheets than I would care to admit, and what
| drives me crazy about each and everyone is that it is not readily
| apparent where the work is being done. I think you could say the
| same about a "programming language" except that the programming
| language is usually not also the product. When the interface is
| the code and the output, the lack of consistent implementation is
| something I find frustrating.
|
| It's a nice thought experiment, but in my mind I think the world
| would be a better place without excel.
| q-rews wrote:
| I don't know why you would say that.
|
| Excel is democratizing tool for programming. It's a true
| WYSIWYG for databases, calculations, plotting, and more. And
| it's just a regular app that every PC has.
|
| Everyone needs a table. Hey did you know your table can do math
| automatically? It actually can fetch live forex data too. And
| infinitely more.
| jrm4 wrote:
| Honestly, I think nearly the exact reverse -- the world would
| be a better place with more _Excels_ and fewer "languages as
| we think of them."
|
| Separation of "developer" and "user" is artificial and more
| should be done to recognize that.
| scoopertrooper wrote:
| In my professional life, I've come to understand the
| difference between a developer and a user is the degree to
| which they can think about a problem riggeriously. Most users
| work at a very high level, a developer helps them break
| problems down and think about things concretely.
| swypych wrote:
| "It's a nice thought experiment, but in my mind I think the
| world would be a better place without excel. "
|
| I agree with most everything you said, however, proliferation
| of programming and automation is a net win in my books, no
| matter the medium, and good spreadsheet software does this
| incredibly well. It makes programming in its very basic form
| accessible to a wide amount of users with a relative gradual
| and easy to grasp learning curve. Sure you can always improve
| on it, but I think the world would most definitely not be
| better off without it.
|
| I do agree that the work is hidden, they can be a nightmare to
| audit, and I think it would scare a lot of people on this board
| the amount of business critical functions that are completed by
| excel and other spreadsheets. However, I like to think this a
| short term problem, and to the authors point, the industry and
| the sw needs and will improve, and we should all be trying to
| eventually close the gap.
| qsort wrote:
| > When the interface is the code and the output, the lack of
| consistent implementation is something I find frustrating
|
| This is the reason why spreadsheets are popular in the first
| place, though. I won't ever defend them - I'm on a project
| right now that's been working on Excel for years, I know the
| pain! - but this is something that's worth thinking about.
|
| See also Jupyter Notebooks, yet another invention from the deep
| pits of hell. The popularity of the interactive paradigm is
| undeniable. Would the world be better if everyone started using
| something sane instead? Definitely so. But the world would also
| be better if every day was Christmas and that's not going to
| happen either.
|
| So while I share most of your concerns, I'm mostly sympathetic
| with the OP.
| FpUser wrote:
| >"It's a nice thought experiment, but in my mind I think the
| world would be a better place without excel."
|
| I stopped using excel after this whole subscription madness
| started and switched to native Softmaker Office. I keep
| countless small spreadsheets for various money related tasks
| and absolutely not prepared to spend any time / effort on doing
| it "the right way". My brain cells are much better off working
| on software design (the stuff that actually makes me money).
| croes wrote:
| >What needs to change is the idea that they are not programmers,
| so they can join us in using modern software practices.
|
| Most of them don't want to use modern software practices, the
| want their formulas and their macros no matter the security
| risks. They don't remove unnecessary code because they don't want
| to read and learn what others had done before in the spreadsheet.
| Excel is easy and successful because you don't need to follow any
| software practice in the first place and that's also the reason
| why it's a pain in the ass for all that have to maintain them and
| keep them secure.
| Jtsummers wrote:
| You keep saying what they don't want to, but remember that
| _most_ users of Excel are _not_ programmers in the traditional
| sense. I 'd wager that most of them aren't even aware of the
| things you say they don't want.
|
| It's not that you don't need to follow modern software
| practices. It's that they don't know about them to follow them
| or not. Further, Excel is almost pure thought stuff. The
| distance between the user's idea and their implementation is
| almost as small as we can get without investing in a lot of
| educational outreach. And then they end up going crazy with
| macros because they don't know there are other or better ways
| to do it.
|
| Also, "modern" Excel (not sure which version, maybe 2007 or
| 2010?) has largely obviated the need for macros with the
| addition of tables and functions for interacting with them. It
| turns Excel into a kind of relational database permitting
| something close to functional relational programming as
| described in "Out of the Tar Pit" [0].
|
| [0] http://curtclifton.net/papers/MoseleyMarks06a.pdf
| hashkb wrote:
| Just to add on to this - remember the first time you saw
| syntax highlighting? And before that, the code was all in one
| color? You didn't know you needed it before, but you didn't
| go back, did you?
| mst wrote:
| Yes, yes I did. As fast as humanly possible.
|
| Syntax highlighting seems to help most people but I find it
| a horrible impediment to reading code.
| hashkb wrote:
| I think you're in a tiny minority but I'd love to hear
| more about how it impedes you.
| croes wrote:
| I have to maintain and support lots of those Excel
| "solutions" and the users actively block learning any
| software practices.
|
| We are talking about people that still use spaces to right
| align a date in word or create the table of contents by hand.
| The least that they want to be is programmers.
| ianhorn wrote:
| Excel is kind of WYSIWYG programming. I use it for quick stuff
| frequently and I'm amazed at what it makes easier than e.g.
| numpy. There's a whole class of error you don't make because you
| see the whole intermediate state all together (there are also
| whole classes of error you _do_ make that you wouldn't make in
| normal programming).
|
| I have been using it for character sheets in tabletop RPGs I'm
| playing lately, and it's great. With a line of js, you can add an
| arbitrary button to google sheets, and then it turns into a
| quick, dirty UI that's transparent (click on the cell and see
| that AC=10 plus dexterity modifier) and on-the-fly editable by
| everyone together.
| xyzzy21 wrote:
| Excel IS code. It's a dataflow language combined with a
| visual/spatial language. It is hard to migrate or transfer to
| other languages because other language don't have these
| features/architecture.
|
| The other side of this coin is that spreadsheets have NOT BEEN
| IMPROVED significantly since VisiCalc. Excel has some window
| dressing and intentional obfuscation by moving UI elements around
| to make it seem improved but it really isn't at all.
| mcbishop wrote:
| >spreadsheets have NOT BEEN IMPROVED significantly since
| VisiCalc
|
| Not true. Excel added dynamic-array formulas a few years ago
| (where a single formula automatically spills into applicable
| cells below the edited cell) -- game changer. And LAMBDA
| functions are currently in the Excel beta version (create your
| own (recursive) functions directly in Excel) -- another game
| changer.
| amirhirsch wrote:
| I offer my master's thesis "Compiling and optimizing spreadsheets
| for FPGA and multicore execution"
|
| https://dspace.mit.edu/handle/1721.1/45983
| breck wrote:
| I'm really enjoying reading this. Ahead of its time.
|
| I really like your RISC CPU in a spreadsheet in Table 1-4. Have
| you seen others do this since you wrote it?
| mozey wrote:
| Years ago I wrote some VBA that exports all the VBA in an Excel
| file. I ran this script manually from time to time so I could add
| my code to version control. Excel should make it easier to
| separate the code from the data. For the former you probably want
| the entire commit history, for the latter you usually only want
| the current state.
| pessimizer wrote:
| When I was doing a lot of VBA work like this, I used an OSS
| tool that would export all of the code and check it into SVN. I
| can't remember the name of it for the life of me, though, but
| it's probably hiding on a drive somewhere.
|
| edit: This was probably it
| https://www.codeproject.com/Articles/18029/SourceTools-xla
| onychomys wrote:
| I work in a lab where there are lots of excel sheets floating
| around. I went one step further - when I save an xltm (an excel
| template), the code is exported and then a bash script
| automatically uploads it into my git repo. The VBA asks for a
| commit message and then all the rest is automatic. It's worked
| pretty well, all things considered.
| spoonjim wrote:
| Excel makes programming easy because all of the intermediate
| values are visible left to right and the loop iterations are
| visible top to bottom. This makes it easy to iterate towards a
| solution by visual inspection, but also creates spreadsheets as
| buggy as you'd expect if you only tested by visual inspection.
| WorldMaker wrote:
| Sounds like you've never seen advanced Accounting spreadsheets
| because Excel definitely does not have left-to-right/top-to-
| bottom or even intermediate values restrictions. There are some
| amazing Gordian knots people have programmed in Excel.
|
| You haven't really seen the horrors of programming in Excel
| until you've needed to use the "Formula Auditing" group of the
| Formulas tab in the Excel ribbon. Admittedly "Trace Dependents"
| and "Trace Precedents" are still rather more visual tools than
| their source code equivalents, but they are their own sort of
| fun.
| azalemeth wrote:
| I share your pain and have seen people doing everything from
| numerical integration to curve fitting in Excel, all of it
| terribly. I worry what new special versions of recursive hell
| the new lambda functions will unleash upon us.
| Someone wrote:
| That's true for only a subset of spreadsheets. There's no
| requirement for formulas to work left to right and top to
| bottom.
|
| Also, the moment you write _= (A1 + A2) /2_, not all
| intermediate values are visible anymore (although Excel has
| support for temporarily making them visible
| (https://support.microsoft.com/en-us/office/evaluate-a-
| nested...))
|
| Also, in my experience, it's fairly normal to have hidden rows
| or columns (https://support.microsoft.com/en-us/office/hide-or-
| show-rows...) or hide entire sheets
| (https://support.microsoft.com/en-us/office/hide-or-unhide-
| wo...)
|
| And of course, the ultimate "not all intermediate values are
| visible" is the use of macro functions or iterations
| (https://support.microsoft.com/en-us/office/change-formula-
| re...)
| OzCrimson wrote:
| A lot of comments are criticizing Excel users as if we are
| resistant to learning more about other programming languages.
| Resistant as in hard-headed or lazy.
|
| One thing to remember is that the vast majority of Excel users
| aren't fully in IT or tech. We have to deal with data but the
| roles aren't primarily data roles.
|
| - Customer Service Reps
|
| - Admin Assistants
|
| - Warehouse Managers
|
| - Non-profit Fundraisers
|
| - Sales Reps
|
| - Realtors
|
| - Inventory Managers
|
| - Insurance Agents
|
| I've taught at non-profit conferences and saw how people were
| torn. The fundraiser who uses Excel every day has to decide: do I
| spend 4 hours in an Excel session or 4 hours in a session on
| fundraising trends?
|
| ===
|
| So many roles require some kind of data use, and Excel is
| immediately accessible, even if all it is is typing numbers into
| a cell, hand-coloring certain values and getting a sum.
|
| Here's the question: WHEN is a person best served to put in the
| time and effort required to learn Python, JavaScript or another
| formal programming language? WHEN should a Warehouse Manager be
| sent to a Python class? What would that situation look like?
|
| Personally, I hate true programming--and I've done a lot of it.
| But true programming is a whole different mindset. I like the
| visual aspect of Excel. But when I open a code editor and there's
| this wall of letters, numbers, indents, curly-brackets ...
| WOAAAHHHHHHH! No. HELL NO!
|
| Even with WordPress and the templates that are supposedly
| drag-&-drop, I still found myself writing CSS and HTML.
|
| ===
|
| One other thing. Don't forget looking the opposite way. Too many
| coders don't know what Excel can do. I watched a presentation on
| 6 hours of JavaScript that someone wrote to accomplish a task.
| That same task would have taken less than 5 minutes in Excel.
| sokoloff wrote:
| I think a lot _more_ automation /computing should be done in
| these more approachable "citizen programming" tools.
|
| "Job done", "I did it myself", and "I understand how it works"
| are three qualities that are often undervalued when "real
| programmers" look at the work of "citizen programmers". I say
| this as someone who loves and makes a living at "real
| programming".
|
| We need more not less sub-real programming.
| haney wrote:
| I've been tasked with migrating an excel model to a "real
| language" (usually by breaking it apart and re-implementing it
| via a combination of ETL and data warehouse jobs). I've never
| found a great way to run excel in a headless way, so in addition
| to not having version control for it, it's hard to "deploy" it
| when it grows beyond a single person's machine. I wish there was
| more of a gradient between Excel and "real systems".
| munchbunny wrote:
| Personally, I've found that Jupyter notebooks occupy that niche
| pretty well.
|
| When authoring, you have something that shows intermediate
| results just like Excel, making troubleshooting without
| dedicated debugging still pretty doable. And then you can still
| run them headless, and you can check them into version control,
| and diffs are readable enough.
| hugi wrote:
| A few years back, I was tasked with a similar thing. A
| government ministry was creating a complex calculator for a
| (very anticipated) public project that was supposed to go on
| their website. We started out using pure JS but the
| mathematician working on the project kept giving us new Excel
| documents with extremely heavy changes to the algorithms.
|
| In the end I gave him a location where he could upload the
| document and told him to just make sure inputs and outputs were
| always in the same predefined cells. Then we used Java and
| Apache POI to load the Excel document and run the actual
| calculations on the website. Best decision ever.
| carpo wrote:
| I did something similar for a few clients, but mainly for
| automating documents. People were copying and pasting between
| Excel and Word, so I made them systems that link the two
| together. I had enough clients asking for something similar
| that I made a SaaS product that does it. Gives them a nice
| little interface that links web form fields to named ranges,
| and then a simple templating language to insert those fields
| into a Word document. Instead of writing a calculation engine
| for our webforms, I just used Excel. It's pretty powerful,
| and more than I could have implemented if starting from
| scratch.
| antris wrote:
| _> In the end I gave him a location where he could upload the
| document and told him to just make sure inputs and outputs
| were always in the same predefined cells. Then we used Java
| and Apache POI to load the Excel document and run the actual
| calculations on the website. Best decision ever._
|
| This is the kind of simple and effective solution that
| programmers who think they know everything would scoff at.
|
| Love it.
| tehbeard wrote:
| It's simple and effective until a part of the rube goldberg
| machine breaks...
| [deleted]
| montecarl wrote:
| I have used the google sheets API to implement something
| similar when working with a nonprofit. They needed a fairly
| complex listing on their website that needed
| search/sort/filter/mapping and needed to update this list
| regularly. So I just took their existing google sheets
| document, and accessed it as a read-only database in the
| browser using Google's REST api and it was fairly painless!
| If they ever broke anything I could easily go into the
| spreadsheet and fix it. This approach really reduced the
| effort needed. If I had to write a "proper" interface for
| them to enter and update their data I wouldn't have had
| time to work on their project.
| WorldMaker wrote:
| The Microsoft Graph APIs in Microsoft 365/Office 365 give you
| pretty much all of the Excel execution engine as REST
| endpoints "in the Cloud" if you just store your Excel files
| in SharePoint.
|
| It's not surprising the number of turducken business
| applications being built exactly this way. With Named Cells
| you don't even have to hard-code cell numbers, just tell them
| to name them specific things, and Excel users are very happy
| with the amount of flexibility to rewrite the spreadsheets at
| will.
|
| It's not necessarily the sanest approach to building
| software, but no one ever accused most enterprise software
| development of being sane.
| pjmorris wrote:
| > turducken business applications
|
| Great analogy
| OskarS wrote:
| As a rapid prototyping tool, it doesn't sound terrible,
| honestly. Many people are comfortable with Excel, so let
| them use it! You're gonna use some calculation engine on
| the backend, might as well be the tool that contains the
| "reference" calculations.
| ebiester wrote:
| Depending on budget, it might be less expensive to look at a
| tool like https://app.molnify.com/#ajax/examples (or its 5
| competitors from a google search.)
|
| It feels like a subset of this should be an open source app
| (that is, turn an excel spreadsheet into a C# app) for anyone
| looking for an idea.
| TTPrograms wrote:
| In the past I wrote a simple formula evaluator in Python I used
| to replicate some multicell calculation - the spreadsheet I had
| took the form of mostly simple algebra being performed in a
| scanning pattern against various small windows in time (rows)
| from a set of columns. I just extracted the cell formula
| definitions and transformed them.
|
| It may not be that hard to replicate the set of formulas you
| need to get 90%+ of your excel model.
|
| If someone implements a 90% reimplementation of Excel in Python
| that would be a really useful library for stuff like this. You
| could do some neat stuff with dependency identification too.
| makapuf wrote:
| There is https://pyspread.gitlab.io/, not sur if it fits your
| use case?
| elliekelly wrote:
| Have you tried AirTable & their API?
| haney wrote:
| I haven't tried it, it does look really interesting, although
| most of the time the problem is that the finance/ops/etc.
| team already had something really complicated in Excel and
| the question is "what should stay in excel, and what should
| be reimplemented in some other system".
| stonemetal12 wrote:
| Isn't that the only reason access exists? Import from excel and
| build forms on top.
| prionassembly wrote:
| There are a few. I use xlwings for https://github.com/asemic-
| horizon/stanton , which is some bits of code to specify expert-
| led sensitivity analysis from Excel _and_ use the results to
| emulate the spreadsheet from a ML model.
| LukeEF wrote:
| That's github based for collaboration I think. The one
| mentioned in the post VersionXL [1] is based on the cloud
| version of TerminusDB (co-founder here), which is an open
| source revision control database. It uses delta encoding for
| updates, but is a proper DB optimized for the task. You get
| transaction processing and updates to an immutable database
| with version control features: branch, merge, rollback,
| searchable diffs, and time-travel. It also ships with a
| mature python client to allow you to manipulate the Excel
| data.
|
| [1] https://versionxl.com/ [2]
| https://github.com/terminusdb/terminusdb
| mcdonje wrote:
| Yeah, you either build a pipeline that generates/updates excels
| that get emailed or self-service downloaded, or you teach them
| how to use powerquery to get the data from the enterprise db.
| surfingdino wrote:
| I used to work with someone who refused to learn another
| programming language besides VBA in Excel. He slowed everyone
| down and it got to the point where he had implemented a JSON
| parser and generator in Excel 97. Badly. It's one of the worst
| experiences of my professional life. I dislike VBA because it
| convinces those who learn it that it is a programming language
| and that Excel is a programming environment just like Python or
| another popular programming language with their standard
| libraries. That's just not the case, but its very hard to
| convince business people who have spent their whole professional
| life using MS Office that there are better choices for building
| their business apps than MS Office and VBA. Just let Excel and
| VBA die.
| MeinBlutIstBlau wrote:
| I worked in a bank where we still used paper because the lead
| supervisor told us we had to. I'm not talking like paper that
| was needed so we just kept it in the file, I'm talking "Print
| out the entire loan profile in paper simply because the
| supervisor refused to learn how to do use a computer" paper.
| We're talking thousands of pages a day for ONE LOAN! All
| because this woman was lapsed by technology and HR had no clue
| she was so out of touch.
| LukeEF wrote:
| Lots of SaaS services, like Google sheets, go the quick and dirty
| route: one central database and the UI displays a view which you
| all work on together. That's not collaboration imho - and no dev
| shop would accept that as a reasonable way to work (lets work on
| the code in a google doc!).
| CivBase wrote:
| > Unfortunately, none of this applies to Excel because Excel
| doesn't work well with revision control. Why? Because Excel is
| not a source file. It is a database coupled with code. [...] The
| path to enlightement is a more sophisticated revision control
| systems - ones that can understand Excel.
|
| This is where the author lost me. The "path to enlightenment" is
| not to build new VCS software. The solution is simply to stop
| coupling your database with your code. Embrace the Unix
| philosophy and stop perpetuating monolithic software.
|
| Excel is a spreadsheet editor. It was never designed to be a
| database. It can act as a quick-and-dirty database with minimal
| setup and training required. Sometimes that's all you need and
| Excel is a fine tool for those situations. But it has
| limitations.
|
| Stop trying to force Excel as the solution to all your problems
| and don't be afraid to learn a new tool once in a while.
| croes wrote:
| "virtually nobody treats Excel seriously like a programming
| language."
|
| Because Excel was not Turing complete until recently.
| WorldMaker wrote:
| VBA was not recent. Also, you'd be amazed by Turing Complete
| things like what someone determined can do with just VLOOKUP().
| That's even before you get into truly abstract models of things
| proven Turing Complete such as Rule 110 of Cellular Automata
| and how easy/hard you can implement them in Excel without VBA
| Macros or "advanced functions".
| thefifthsetpin wrote:
| Can't you just implement a turing machine in Excel by using the
| cells in a row as your tape?
|
| * Store the initial internal state in A1.
|
| * Store the initial head position B1.
|
| * Store the initial state as boolean values in the rest of row
| 1.
|
| * Write simple lookup formulas in row 2 to compute the next
| state from the previous row.
|
| * Fill down. Look for the halting state in column A and your
| output will be written in that row.
|
| What am I missing?
| john_alan wrote:
| Yep Excel is great, actually working on a Minix like Kernel in
| it.
| behnamoh wrote:
| It's sad that after nearly 50 years, the way we write programs
| has not changed. We still use keyboards and write code one line
| at a time. Sure, there are auto-complete extensions and helpers,
| but the basic idea is still the same: write your instructions for
| the computer to perform them.
|
| When it comes to making programming approachable for the masses,
| it's actually kinda funny to think that Excel (and spreadsheets
| in general) have been way ahead of traditional programming
| software.
|
| I hoped that new tech (AR/VR/etc) would help shift the focus from
| "typing" programs to "drawing" programs. But efforts to visualize
| programming only remain at the conceptual level and never gained
| traction.
|
| It's hard to imagine 100 years from now we will still be typing
| code.
| analog31 wrote:
| Creating complex things using drawing tools is physically
| laborious, and suffers from readability problems when things
| get too complicated to fit on one screen. I've seen this with
| mechanical and electrical CAD.
| piyh wrote:
| Typing speed is not my bottleneck for generating code, it's
| renaming variables and rewriting it 5 times until it's no
| longer a mess.
|
| A Vulcan mind meld would be nice but lacks precision.
| surfingdino wrote:
| Musicians have been happy with their simple keyboards for
| hundreds of years. Why wouldn't software developers be using
| theirs in a hundred years?
| greenreptar wrote:
| Surprised nobody has mentioned this. There is a company called
| Boardwalktech with a tool called "Excel Cloud" which adds a
| native extension into Excel which includes a change log and (i
| think) realtime collaboration, among other things.
|
| They call their underlying tool a "digital ledger" which sounds
| very blockchain-y, but it's not a distributed public ledger so
| there's no crypto here, just a centralized, Boardwalktech
| controlled ledger.
|
| https://www.boardwalktech.com/products/boardwalk-excel-cloud
|
| They're already integrated with some very big companies like
| Accenture, Ernst and Young, Coca-Cola, Mars, Facebook, etc etc.
|
| Personally, I can't imagine company leaders really investing tens
| to hundreds of thousands of dollars leaving their processes in
| Excel and not instead buying a real system, but I'm not running
| all of the companies mentioned above.
| kyberias wrote:
| > Because Excel is not a source file.
|
| Well, it is a zip-archive with XML files, so it's close.
| banana_giraffe wrote:
| Notably: The VBA stuff is stored as a binary OLE2 blob thing
| inside of the xlsm file.
|
| (Or at least it is in the few spreadsheets I checked, no clue
| if there's some way to change that behavior)
| cxr wrote:
| I recommended exploring this approach here
| <https://news.ycombinator.com/item?id=27998733>:
|
| > _Hot tip for handling office file formats or anything that
| uses a ZIP container: just unzip them and commit _that_ to the
| repo._
|
| Even modern (zipped XML-based) office file formats do make some
| limited use of binary blobs. You can either keep these intact,
| or write a small objdump-like tool that serializes them to
| text+. For portability, it might be best to write the
| serializer/deserializer in JS dumped into a thin HTML wrapper,
| so you pretty much anyone can double click to "run" it. (My
| experiments on roundtrippability with including _that_ file in
| the ZIP container yielded poor results.)
|
| + I've used this strategy for Oberon .rsc binaries. Due to
| Wirth's affinity for single-pass compilers, the Oberon
| toolchain doesn't involve a discrete assembler or AOT linker
| tool, so there is no assembly format or linker scripts.
| However, Wirth's distribution of the Oberon system does have an
| ORTool utility <https://people.inf.ethz.ch/wirth/ProjectOberon/
| Sources/ORToo...> (in the vein of objdump/readelf/nm) that will
| dump a textual description of the binary you give it. I
| realized that with some slight tweaks, you can use the output
| of ORTool.DecObj as a de facto "assembly" format--just write a
| tool capable of parsing it and then write out the corresponding
| binary.
| inshadows wrote:
| >> Hot tip for handling office file formats or anything that
| uses a ZIP container: just unzip them and commit _that_ to
| the repo.
|
| What is the point if that? I think neither binary nor XML
| output would be meaningful in the diff output.
| WorldMaker wrote:
| I built a tool to explore version control of files like that by
| decompressing their contents and version controlling those. It
| was an interesting experiment.
| PicassoCTs wrote:
| Excel as code is a main spreading vector for bad practices like
| copy & paste, monolithic procedural monsters, bad databases with
| duplicate entries and so forth.
|
| The reason why management cant perceive code-quality, is because
| there main tool, does not allow for good code-quality. In fact it
| does not even allow for abstractions..
|
| If you ever wondered, why management does not blink and recoil
| one description of coding horrors..
| bob1029 wrote:
| Only for a lack of imagination would you fail to perfectly model
| your target problem domain in terms of tables & columns... You
| would have a fucking monster of a time trying to describe to me a
| practical problem that I could not hypothetically wrangle &
| demonstrate with Excel. Just think about it. You can model a _ray
| tracer_ in Excel if you have the patience for it.
|
| The magic of Excel is that it runs everywhere and is very
| intuitive to work with. I honestly can't recall any users who
| were simply unable to function in a basic read-only way with
| Excel. Iterating complex problem domains in excel workbooks is a
| low-friction way to collaborate with your business stakeholders.
|
| Once you get it nice in Excel, the next steps are compelling.
| Using an obvious 1:1 mapping between Excel worksheets and SQL
| tables, you trivially move all data items into a realm to be
| easily queried using a declarative, domain-specific language. You
| can also sprinkle in views and user-defined functions for maximum
| happiness on the business-side of the house.
|
| The richer and better-normalized the relational model, the better
| your SQL interface will be. If you ignore the performance
| equation for just a few seconds, you might see the blinding
| luminosity of cleanliness that emerges from normal forms beyond
| the 3rd one. We are going to investigate a variation on 6NF for
| the next major version of our product.
|
| I will conclude my rant by saying that there is no logical
| determination/interpolation/projection of facts which is
| unachievable in an ideal SQL representation. It is _very_ easy to
| teach SQL to non-wizards by way of the mighty example. Excel is
| the most important starting point on this journey, because it
| defines the common language and relations that you and the
| business will use to refer to all of the things.
| jagged-chisel wrote:
| > Git was not built for this - ...
|
| But it does have a sort of plugin system to support other
| formats, right? Does an Excel format lend itself to being
| supported in this way?
| wcerfgba wrote:
| You can still use 'straight Git', maybe with some PR management
| system like GitHub/GitLab/... . The difference is you can't
| rely on the diff to be useful, instead you'd need to provide a
| good commit message summarising the changes (which you should
| do anyway!) and then reviewers will need to check out the
| relevant version to poke it directly in Excel.
|
| But I agree that being able to manage an XLS(X) as plain text
| and having a proper diff would be incredibly useful. :)
| da_chicken wrote:
| I highly doubt you'd ever get diff for XLS in a general or
| universal case. That format is so old and crusty that it's
| only really defined by what Excel will do with it.
|
| XLSX, on the other hand, at least has to follow XML
| conventions and basic ZIP file structure, even if the open
| specification for the XML is really now a strict subset of
| what the current version of Excel will accept.
| marklit wrote:
| Git supports extension-specific overrides which enables things
| like textual comparison of Office files.
| https://tech.marksblogg.com/git-track-changes-in-media-offic...
| WorldMaker wrote:
| I explored storing file types like XLSX as the deconstruction
| of their zip file into individual XML/etc files. In my cases my
| focus was DOCX rather than XLSX, and I originally targeted a
| different VCS than git so I built it as precommit/postmerge
| hooks rather than git's diff hooks/attributes plugins. I got
| some interesting results with my tool and it wasn't a bad
| experience. Just not one I could suggest to novice users
| (fixing XML in a merge conflict is not entirely fun and very
| different from say Word's own review tools designed for higher
| level merge fixing).
| theonlybutlet wrote:
| It mind-boggles me that microsoft are not investing in VBA more,
| its userbase is massive. Sure its old and has its problems but
| I'm sure continuing to develop it alongside more modern solutions
| would help them rather than hinder their efforts. Make it more
| similar to other things out there and eventually people will
| change over.
| wvenable wrote:
| Microsoft borked it with the migration to .NET. Instead of
| making VB.NET 100% compatible with VBA they created a
| unnecessary C# clone with a VB skin. That decision ended VB as
| a viable product and any migration path for VBA in Office.
|
| If they had made VB.NET fully compatible, then we'd all just
| have the CLR in Office and we could be using any number of
| languages to write Office integrated software.
| jayd16 wrote:
| So this is an ad for some new merge tool I suppose.
|
| Is there a solid open source tool for merging Excel files? Or
| CSVs or SQLite files for that matter?
|
| I think this is probably best seen as a shortcoming of our
| current general VCS. At the moment we're stuck with newlines as
| the main means of merge semantics. That really restricts what we
| can put in VCS. Even with custom merge tools, its quite
| cumbersome as git does not allow this to be preconfigured.
| jacobdi wrote:
| I think this is spot on. I agree that Excel users want to stick
| with Excel, but they do run into major issues that are solved by
| code. Namely: their data size is too large, Excel is too slow,
| and they struggle to get repeatability from their work.
|
| I am building Mito[1], a spreadsheet interface for Python. Every
| edit you make in the spreadsheet generates the equivalent Python.
| It is a bridge between the workflows of Excel users and Python
| users, and allows Excel users to reap Python's benefits without
| needing to know how to code.
|
| [1] https://docs.trymito.io/
| Zababa wrote:
| > People refuse to stop using Excel because it empowers them and
| they simply don't want to be disempowered.
|
| That is not always true in my experience. Many people use Excel
| because it's one of the two programming tools allowed by the IT
| department, the other being a web browser. Even if you manage to
| install Python or something (good luck getting the package
| management working from behind your corporate proxy), your
| collegue will not have it, so it's useless. And distributing
| executables is usually not tolerated either. So you use excel,
| and share Excel files.
|
| I'll add that another big problem I have with Excel is usually
| the lack of database support. Moving data around by copy/pasting
| it in Excel with macros is a pain, and IT didn't allow Microsoft
| Access either so I can't comment on that. But I think it would
| have made my life easier.
| JohnnyHerz wrote:
| If you substitute "spreadsheet" for "Excel" than i agree. But i
| have tried everything possible to avoid Excel as every
| iteration just brings new problems instead of fixes. I used
| Clarisworks years after it was EOL and now am trying very hard
| to convert to Libreworks. Admittedly i can't completely escape
| Excel yet, but i am hopeful and it's getting to the point where
| the bugs in Libreworks are no worse the bugs in Excel. If not
| for all the Legacy Excel sheets i have, i'd be off it
| completely.
| dan-robertson wrote:
| I somewhat disagree. I work with a lot of excel power users. We
| have some massive spreadsheets which are collaboratively worked
| on and do complicated things. The first thing to say about them
| is that they are very valuable to the business so it is
| important to be able to do some of the things excel does.
|
| Excel has a lot of advantages compared to regular programming:
|
| - It is quick to change. The programs I work on take nearly an
| hour to go from code review completion to production, even with
| manual poking to speed up continuous deployment. It can be
| valuable to be able to change things quickly.
|
| - In excel the main thing you interact with is the data. If you
| are a domain expert then you should be able to look at outputs
| and see if they seem right. When you change a formula or add a
| column, you are, in some sense, also getting to run it on
| realistic data instead of needing to try to construct realistic
| tests.
|
| - There isn't much difference between config parameters and
| hard coded values. In the programming language I use, you can't
| really have globally readable configs so any new parameter must
| be threaded through from app startup to the place you want to
| use it, discouraging configuration parameters. Which means it
| is often slow to change something that ought to have been
| configurable. In excel you can make a quick cell for some
| Config parameter (changing a lot of formulas is not so fun
| though.)
|
| - Functional and declarative, Excel tends to give you
| internally consistent output. There is less need to worry about
| incorrect state updates.
|
| - Its maybe better for producing graphs. I never really liked
| making graphs in excel and I thought the defaults were bad for
| good data visualisation but then other systems have bad
| defaults (when I draw a graph I often use GNU Calc with
| gnuplot...)
|
| - Pivot tables are great for ad-hoc analysis (indeed Excel is
| pretty good for as-hoc analysis in general.) The pivoting
| operation is trivial in excel and a big pain to with tools like
| grep or awk or sed.
|
| These Excel users are generally capable of programming too and
| may use jupyter notebooks with python or R, or something more
| fully featured when required. And some things will get
| outsourced to software engineers, but excel is still clearly
| useful (so long as it scales) and people don't just use it
| because they are desperate for some kind of 'real' programming
| language.
| rmbeard wrote:
| No-one is going to pay to version control Excel.
| Jtsummers wrote:
| Actually, people do. But it's not terribly fine-grained.
| SharePoint offers version control of MS Office documents and is
| used in many businesses as an improvement over shared drives
| and files named:
| Foo_Report_v3_FINAL_20210928_FINAL_DRAFT_FINAL.xlsx
|
| I don't think you get branching with SharePoint, though.
| xupybd wrote:
| I'm about to pitch this to my manager. We have automated
| manufacturing going through Excel. It allows the domain experts
| to tweak the manufacturing process without having to learn to
| code.
|
| The price is going to make this a difficult sell. If it was one
| off at $1000 easy but monthly per user...
| aarreedd wrote:
| There is dolthub.com which is Git for data. But there is only an
| SQL interface. No way to source control the style of the data in
| Excel.
|
| Last time I looked into Dolt there were no commit hooks either.
| That would let you add linting or other data validation.
| fzumstein wrote:
| At https://www.xltrail.com, we wrote an open-source Git extension
| that allows you to diff the VBA part of your Excel workbooks. The
| extension also integrates with SourceTree, Atlassian's free Git
| client. You can see some screenshots on my blog post:
| https://dev.to/fzumstein/how-to-diff-excel-vba-code-in-sourc...
___________________________________________________________________
(page generated 2021-09-20 23:00 UTC)