[HN Gopher] Building a data team at a mid-stage startup
___________________________________________________________________
Building a data team at a mid-stage startup
Author : squarecog
Score : 543 points
Date : 2021-07-08 21:04 UTC (1 days ago)
(HTM) web link (erikbern.com)
(TXT) w3m dump (erikbern.com)
| gumby wrote:
| Great article. The confusion about what team does what is
| priceless...yet so common!
|
| To provide some sympathy for the folks already working there: you
| always replace systems well _after_ you've overrun them.
|
| When the ad hoc system works (consider that google spreadsheet at
| a time when there were three support people and perhaps a dozen
| customers) you're not going to decide to replace it with
| something more complicated. Then you're busy growing so you just
| keep the system going through sheer force of will. You only
| replace it when the effort is unbearable; at that point you say,
| frustratedly, "I wish we'd done this sooner."
| cobertos wrote:
| Part of me wonders what the long term of a transition like this
| looks like. Would this company be able to keep its data
| consumption healthy, or would it drive product changes that might
| harm its users or lead to dark patterns?
| civilized wrote:
| Wow, a story where things start out a mess and end up a lot
| better! Can we write one of these for society too?
| roystonvassey wrote:
| This is a perfect encapsulation of my career as a data-guy square
| peg in a round hole, filled with jargon and misplaced
| understanding of data in general.
|
| Despite all that you read and hear about data science advancing,
| you'll be surprised to see how poorly data is leveraged or, worse,
| how billions of dollars are sought to implement the latest tool
| that promises to change the world. Tech and data as we imagine it
| to be in FAANG-type companies is far different from how it is in
| older industries. It's not just systems that need upgrading;
| company cultures do too, and that's never an easy or fast process.
| I've been in the data Analytics space for 16 years now and I
| still feel, more often than not, I'm part of the minority,
| working to demonstrate true data use-cases
| jabagonuts wrote:
| Really enjoyed this narrative, but what about the next phase?
| Going from mid-stage to mature startup?
|
| > Note that you took on a lot of "tech debt" earlier when you
| started dumping the production database tables straight into the
| data warehouse.
|
| How do you manage expectations when the year-long honeymoon is
| over, the business grows tremendously, and the centralized data
| warehouse reaches a breaking point?
| neighbour wrote:
| Also thought this. Let's hope the author has a SQL in the works
| as I am keen to hear more.
| [deleted]
| spicyramen wrote:
| Can correlate, the author is truly a genius. We had a company
| mandate to be ML first, we went through a lot of phases and so
| many conversations happened as described in this amazing piece.
| Thanks Erik
| simonw wrote:
| "This is basically a (somewhat cynical) depiction of things that
| may happen at a lot of companies early in the data maturity
| stage"
|
| I don't think this is very cynical at all! Feels pretty accurate
| to me.
| IMTDb wrote:
| What would be the name of the position/profile of someone in
| charge of building the data warehousing architecture/ETL
| pipelines?
|
| In my view, they need to make sure the warehouse model is a
| correct representation of the business and that it can be
| leveraged to answer basic or not-so-basic questions using SQL.
| They also need to promote its usage internally by ensuring it is
| accessible and easy to use, and guide other teams to a more
| data-oriented mindset.
|
| I feel that this is a specialised position not exactly similar to
| a developer, but every time I look for "data scientist" I get
| guys that want to do machine learning prediction models, which is
| not exactly the same stuff either.
| edmundsauto wrote:
| This is what data engineers do, although that is also used to
| describe data ops (maintaining clusters, running kafka, etc.)
| herodoturtle wrote:
| You pretty much described my job in a nutshell, and they call
| me "the database guy".
| sischoel wrote:
| What about "data engineer"? There seem to be a lot of jobs for
| that title nowadays.
| skrtskrt wrote:
| Yeah we would call this Data Engineer (likely Senior level or
| up for someone that has had experience building multiple data
| warehouses) plus the DevOps/SRE work required to stitch all
| the architecture together
| sjg007 wrote:
| The bigger issue is adaptability.. can you migrate schemas
| preserving older clients, typically that's by providing a
| decent middleware.... SQL views are one way, APIs are another
| etc...
|
| All of that while improving performance.
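|
| A minimal sketch of the SQL-view option, assuming SQLite and
| hypothetical table names (`customers_v2` for the new schema,
| `customers` for the old one). The view keeps the old shape alive
| so that legacy queries need no changes:
|
```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- New schema (hypothetical): the name is split into two columns.
    CREATE TABLE customers_v2 (
        id INTEGER PRIMARY KEY,
        first_name TEXT NOT NULL,
        last_name TEXT NOT NULL
    );
    INSERT INTO customers_v2 VALUES (1, 'Ada', 'Lovelace'), (2, 'Alan', 'Turing');

    -- Compatibility view: exposes the old shape (a single full_name
    -- column) under the old table name.
    CREATE VIEW customers AS
    SELECT id, first_name || ' ' || last_name AS full_name
    FROM customers_v2;
""")


def legacy_client_names(connection):
    """A query written against the old schema; it keeps working unchanged."""
    rows = connection.execute("SELECT full_name FROM customers ORDER BY id")
    return [name for (name,) in rows]


legacy_names = legacy_client_names(conn)
print(legacy_names)
```
|
| The same translation could live in an API layer instead; the view
| is simply one place to put it.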
| teej wrote:
| A new role has arisen in the last few years that captures much
| of this responsibility - Analytics Engineer.
|
| This article by Claire Carroll describes the role and
| motivation for it:
| https://www.getdbt.com/what-is-analytics-engineering/
| tmp_anon_22 wrote:
| Most common would be a DevOps or SRE on an observability team.
| pram wrote:
| I've done this for the past 6 years and my title was "Big Data
| Infrastructure Engineer" but I don't think there's any
| consistency at companies from what I've seen
| Orou wrote:
| I would also vote for "data engineer" (it's my current job
| title).
|
| You very likely don't want a data scientist to be doing a data
| engineer's job (and they probably don't want to be doing it
| themselves!). While there are similarities, data engineering
| tends to be a lot closer to software development than data
| science. If you're advertising for a data scientist role, don't
| expect them to be happy if 80% of their job is writing ETL
| scripts and cleaning datasets.
|
| I think the reason there has been a flattening in data
| scientist job growth more recently is that lots of companies
| hired data scientists to build cool ML applications but had no
| infrastructure in place to support advanced data analysis.
| These companies didn't realize they needed to walk before they
| could run, and that what they really wanted was data analysts
| and engineers to build the foundation for a strong data science
| function.
|
| Tools like dbt have been great for advancing an ELT approach to
| managing data pipelines, where modeling for BI tools, business
| users, and data scientists alike can all happen in the
| warehouse and ensure consistency in data usage across the
| company.
| dijksterhuis wrote:
| Seconded.
|
| I was a bit sad to not see any mention of a data engineer
| anywhere in the article.
|
| Like, if you gave me access to all the prod tables and the
| warehouse I'd be having a whale of a time and (hopefully)
| delivering enough business value to automate some of the more
| regular "English to SQL" translations.
|
| > You very likely don't want a data scientist to be doing a
| data engineer's job.
|
| 100%. This is one of those things that would make
| "disgruntled ML people" in the article want to leave.
| ramraj07 wrote:
| The one issue is that the gamut of experience and ability in
| a data engineer (and the salaries) is extremely wide, far
| wider than I've seen for any other role. Hiring a good DE is
| so hard!
| sails wrote:
| IMO data engineer roles are further subset into:
|
| 1. kafka / streaming oriented software engineering
|
| 2. data warehouse and ETL/ELT development for analytics
| dijksterhuis wrote:
| A good data engineer understands and can work with both of
| these.
|
| They're both "data in, data out" mental models that are
| part of the Lambda architecture which every data engineer
| should at least know about [0].
|
| But if you want a specialist streaming person to optimise
| all the streaming pipelines, then sure hire a specialist.
|
| [0]: https://en.m.wikipedia.org/wiki/Lambda_architecture
| rickeydidio wrote:
| This is spot on. As someone who has been looking for a data
| analyst role, I've actually read quite a few DS reqs that
| were geared more towards infrastructure and ETL. Then the
| flip side with the DE reqs wanting NumPy and Pandas along
| with the infrastructure and ETL. Weird, right?
| hobs wrote:
| I currently do that job as a Data Architect - kind of a
| mouthful lol but it covers the gamut of understanding the
| entire business as an abstract set of data flows, being
| responsible for the ingest and outflows of data, the level of
| quality in our overarching system, managing data engineers,
| developers, business folks all accessing said data, at the end
| of the day explaining what it all means to our clients and devs
| via standard modeling stuff and more targeted things as needed.
| edmundsauto wrote:
| You mention that you manage data engineers. Where does your
| role not overlap w/ a data eng?
| hobs wrote:
| In our team it's mostly a difference of business focus and
| the overarching responsibility - most data engineers I work
| with manage a major leg of the business and are responsible
| for their domain but I am responsible for all of them.
|
| I certainly spend time coding (especially because again,
| small-medium startups can't afford anyone in the data space
| who isn't able to heave ho) but much of it is translating
| pretty vague stuff into market research/a proof of
| concept/an initial design of what will bring value to the
| business and scale alright and then often more people will
| throw in.
|
| That being said you can call me whatever you want, as long
| as its not late for dinner :)
| mjirv wrote:
| Analytics Engineer is a clear one for this, as teej said.
|
| The title is strongly associated with the dbt community, so it
| could imply you're using dbt for your data modeling (not
| necessarily a bad thing, as it sounds like it would be a good
| tool for your use case).
| marcinzm wrote:
| You're mixing up two different tasks as I see it:
|
| * Building/defining the data infrastructure
|
| * Building/defining the schemas
|
| In a traditional ETL infrastructure they are jumbled together
| but if you do ELT they are not. A data engineer can build the
| infrastructure but the transformations can be handled better by
| technical analysts. They're simply one view on the underlying
| data so the risk is minimal. Analysts query the data day in and
| day out so they know much better what they need than someone
| who doesn't.
| czep wrote:
| This is so eerily familiar I swear I've had many of these exact
| conversations word for word. The only way this doesn't turn into
| a complete nightmare of a cluster is if the exec team "gets it".
| If so, you just might stand a chance at building a data team that
| gels with the rest of the org.
|
| But if the exec team simply hired you for window-dressing, expect
| to be treated like a scapegoat and a punching bag. Any mistakes
| will be your fault. Any wins will be to the credit of the
| business. The Director of Product will ask to "embed" dedicated
| DS headcount and you won't have any real power to shape the
| roadmap. If the exec team doesn't give you equal footing with
| Product (or Marketing, Finance, and Eng for that matter) then
| this will rapidly become a soul-sucking job. However, if E-team
| does give you the authority to call Product's bullshit, and tell
| Finance to stuff it, and not take direction from Eng leads, then
| you actually might be able to accomplish something really cool.
| WastingMyTime89 wrote:
| > However, if E-team does give you the authority to call
| Product's bullshit, and tell Finance to stuff it, and not take
| direction from Eng leads, then you actually might be able to
| accomplish something really cool.
|
| So what's the business case for having a data team independent
| of product, business and engineering?
|
| Because as I see it the data team is a support function, not a
| core part of the business. I'm sure it can be cool for you but
| if you are at odds with all the people actually creating value,
| what exactly do you bring to the table?
| higeorge13 wrote:
| Engineering builds some schema, creates and uses multiple data
| stores, message queues, etc. Eventually the queries no longer
| work properly as the company scales and gets more and larger
| customers, along with hundreds of other issues. Doesn't
| engineering need a proper data engineering team/DBA/you name it
| to handle those?
| marcinzm wrote:
| In my experience much of this is a question of trust, political
| capital and soft power. Find out the problems that the key
| players in the business are actually having that you can solve
| and then solve them. Find out what the key KPIs are for the
| business and make a plan to improve them and then have a plan
| to publicize that improvement. And make sure to hire a team
| that covers your weaknesses rather than exposes them. Don't
| fight people if you can help it, either they're as competent as
| you on average or you shouldn't have taken the job. Figure out
| how to help them and what they need to work more efficiently
| and then give it to them. Sure there's a ton of politics
| involved in all of that but that's management in general.
| nwsm wrote:
| This was my only complaint about this great article. The CEO
| was innately "data-driven" which opened a lot of doors.
|
| OTOH, if the execs don't have this priority, no one gets hired
| to lead and scale a data team and the story never starts.
| PragmaticPulp wrote:
| This applies to most specialties. Companies tend to have a few
| teams that lead the charge and expect everyone else to follow.
| Knowing which teams get the authority and which teams are along
| for the ride at a company is important for knowing what your
| job experience will look like.
|
| > However, if E-team does give you the authority to call
| Product's bullshit, and tell Finance to stuff it, and not take
| direction from Eng leads
|
| I know this was meant partially in jest, but if you reach the
| point where you're at odds with all of the teams and
| departments in the company you may get a lot done in the short
| term, but long term it's going to be difficult if you don't
| have some allies in each of those departments. Obviously no one
| should roll over and take orders from other departments, but
| sometimes it's necessary to do some give and take to build
| rapport. It's a balance, not a war.
| czep wrote:
| Thanks for the tips! One mantra I've tried when starting at a
| new job is "for the first 3 months say yes to everything, for
| the next 3 months say no to everything." The idea is you
| first immerse yourself in everything, to find out what works
| and what doesn't. Then you dedicate time to fix the broken
| processes so that hopefully when you hit 6 months your team
| is better positioned to be more efficient. Obviously you
| can't be too rigid, but it seemed to work for me when I had
| buy in. Curious if you think that approach sounds good.
| PragmaticPulp wrote:
| Good advice as long as you don't take it too literally.
|
| The most important thing is to work closely with your
| manager on expectations. If someone from another department
| comes to you with a proposal, an ask, or a directive, you
| don't want to say yes without first consulting with your
| manager. Depending on company politics, some managers might
| try to rope new employees into doing work that isn't
| actually part of their job description.
|
| Discovering expectations and then proactively managing
| those expectations is key in any role.
| tharkun__ wrote:
| Very good advice. I've also seen this from new ICs
| (incidentally from one of our new data guys). I bet he
| said yes but he shouldn't have.
|
| New guy, knows nothing about the company and product yet
| but was asked to "get KPI X by end of day". He obviously
| has no idea how to get this done so goes to various
| people and throws around the "VP XYZ wants this by end of
| day, help me now or else!".
|
| Needless to say I, as politely as I could, told him to
| shut it, look at his data and what he could get from it
| and stop interrupting dev with mid day, two days after
| start of a sprint, requests to do his work for him (dude
| I don't even have access to your data storage, don't know
| what data you have or don't etc). And do it by end of
| day. Sure.
|
| The guy is burned for me now. He will have to do a LOT of
| sucking up to dev now for his try at "do my job for me or
| else"
| ttz wrote:
| > MBA types
|
| I chuckled. Then cried, because at least his MBA types can use
| SQL. My MBA types use Excel.
|
| OT: Good article. Like and agree with the push for centralizing
| data first, then building outwards so external teams can move
| towards self-service.
| herodoturtle wrote:
| I'm an MBA type that studied math and computer science, and for
| a living programs distributed database solutions.
|
| I chuckled too.
| munk-a wrote:
| Building a good process into your company to receive a query,
| execute it against a read-only database, and shovel the results
| back to the user as a CSV file will pay dividends and is,
| honestly, pretty trivial in most cases.
| ttz wrote:
| Funnily enough, this is what I did, except I built an app
| where I write the queries as "pre-built" parameterized ones
| (sanitized, of course).
|
| People still do a bunch of stuff in Excel, though, and every
| once in a while, it breaks, and I have to dig through the
| mess. Excel is great when it's just for yourself and you can
| manage it... it's a pain when others have to figure out
| someone else's.
| jaggederest wrote:
| Blazer is my go-to for this kind of thing:
|
| https://github.com/ankane/blazer
|
| Pretty easy to set up and share queries, dashboards, whatever
| herodoturtle wrote:
| For the last 15 years I've been building (what I consider to be)
| accessible database solutions, for a bunch of different
| industries.
|
| This sentence from the article resonated with me:
|
| > You're starting to lay the most basic foundation of what is
| most critically needed: all the important data, in the same
| place, easily queryable.
| Artgor wrote:
| When I had started reading this article, I had thought that it
| would be a sad story about another startup failure. The blogpost
| turned out to be a fascinating story of the success. I really
| liked it.
|
| But after I had finished reading it, I have realized that it is a
| sad story, if we look from the eyes of data scientists in the
| team. People were hired to do cool machine learning projects, but
| it turned out there is no infrastructure for them. After the new
| boss had arrived, they had to work as analysts for months. What
| is more sad - the new boss dangled a carrot before them several
| times, but each time the carrot disappeared.
| AtNightWeCode wrote:
| I really enjoyed reading this. Very well written. At companies
| where I've worked, teams can never read data from the DW, btw.
|
| My experience with A/B tests is that they are way overrated.
|
| On the poor data quality: say you sit on a product like a call
| center. Frontend developers think it is an excellent idea to
| store all data in some doc db blob. Then the business wants stats
| about the number of calls based on users...
|
| Be careful when putting tabular data into doc dbs.
| tsrez wrote:
| It's such an interesting and valuable article on building a data
| team, esp. insightful for organisation starting out. Guess the
| challenges in traditional/larger companies starting out a data
| team might look slightly different.
| correlator wrote:
| Thank you for writing this. I personally just walked into a very
| similar role and this rang really true. This article made me
| realize how much more effort I need to put into the data culture
| side of the role.
| soumyadeb wrote:
| Such a great read. Have been in this position in a large public
| org. Over a year was spent just creating a catalog of what all
| data the company has and figuring out how to pull them into a
| data-warehouse
| waynesonfire wrote:
| TLDR, refine your thoughts.
| oliv__ wrote:
| Refine your mind
| te_chris wrote:
| This is a good write-up, but for the sort of insights they're
| getting they're over staffed and overpaying. A combination of a
| cloud dw (big query, e.g), cloud etl (stitch, fivetran) and dbt
| for the T in ELT to build useful reporting tables, along with
| some sort of sql based BI (mode, in our case), could deliver the
| same insights for a fraction of the price. Throw in a sub to Heap
| or similar for ad-hoc product analytics as a cherry on top.
|
| I concede, of course, that they're rescuing a bad situation, not
| starting from scratch, but still.
| mindvirus wrote:
| This is a wonderful article, thank you for sharing. I really like
| the narrative of bringing people with you on the journey, and
| celebrating the small wins that lead to a good long term outcome.
| plaidfuji wrote:
| So many gems in this article...
|
| > You notice a lot of the code starts with very complicated
| preprocessing steps, where data has to be fetched from many
| different systems. There appears to be several scripts that have
| to be run manually in the right order to run some of these
| things.
|
| > "We need to focus on delivering business value as quickly as
| possible", you say, but you add that "we might get back to the
| machine learning stuff soon... let's see".
|
| So so relatable. But the key insight is a really really key
| insight.
|
| > What I think makes most sense to push for is a centralization
| of the reporting structure, but keeping the work management
| decentralized. Why? Primarily because it creates a much tighter
| feedback loop between data and decisions. If every question has
| to go through a central bottleneck, transaction costs will be
| high. On the other hand, you don't want to decentralize the
| management. Strong data people want to report into a manager who
| understands data, not into a business person.
|
| I have the same role at a non-software company, and to me this is
| nothing short of a complete reimagining of IT. It's not just,
| "make sure everyone's computer works and help them install
| software," it's, "build a model of the business, determine what
| information flows and metrics are crucial to success, and build
| an IT and analysis infrastructure around that model." The CIO
| will soon be better thought of as the Chief Optimization Officer.
| plank_time wrote:
| This is probably the single best-written and most realistic
| article I've read on HN ever and I've been on HN for a long long
| time. It's so realistic I wonder if the author took it from his
| diary or something. Everything about it is supersaturated with
| authenticity and teaches better than any other article I've read.
| Kudos to the author, and I would love to see this style of
| article take off.
| maileslin wrote:
| Erik is a legend in the modern data world. Wrote Luigi and
| built Spotify's first recommendation engine. He has the ground-
| level experience to lean on
| alexpetralia wrote:
| His post on Berkson's Paradox is excellent!
| zippy5 wrote:
| This was wonderfully written and if you're gonna start a data team,
| this is how you do it. But I can see that I'm the only one who
| thought it was crazy to start a data team in the first place.
|
| This company makes 10M and spends 3M on the team and
| infrastructure to make data a core competency?
|
| A vast majority of wins discussed were lowly differentiated web /
| mobile / supply chain analytics which they could have gotten and
| setup with 3rd party software for an order of magnitude cheaper.
|
| I can only imagine what this hypothetical startup could have
| learned if they spent that money actually talking to customers,
| and running more experiments.
|
| I've heard people talk about data as the new oil but for most
| companies it's a lot closer to uranium. Hard to find people who
| can handle / process it correctly, nontrivial
| security/liabilities if PII is involved, expensive to store and a
| generally underwhelming return on effort relative to the
| anticipated utility.
|
| My take away was that startups benefit tremendously from a data
| advisor role to get the data competency, as well as the
| educational and cultural benefits, but realistically the data
| infrastructure and analytics at that scale should have been
| bought not built. Obviously there are a couple of exceptions, such
| as regulatory reasons like HIPAA compliance, for which building
| in-house can be the right choice if no vendor fits your use case.
| roenxi wrote:
| Having _unique_ data is quite valuable. If your organisation
| can make decisions based on signals that other people can't
| detect then it can gain a decisive edge.
|
| I do wonder at the anecdotes in this article though. In
| businesses that I've seen, the data team is usually the biggest
| impediment to a data-driven culture because they have databases
| full of numbers and no real grasp of how that links to the
| decision making process that makes the business money.
|
| Beefing up the team doesn't help. In data, as in business more
| generally, the important thing is to not try to guess what job
| you're doing and instead spend a lot of time talking to customers
| about what job they need done. If the data team is where that work
| happens in a business then that can be helpful - but the grunt
| work of SQL/reporting/basic analysis is almost never where the
| value appears from.
| chupchap wrote:
| > it's a lot closer to uranium
|
| Love this analogy!
| lifeisstillgood wrote:
| As someone who reaches for code if they need to blow their
| nose, what is a 3rd party vendor going to supply that
| "English-to-SQL translators" won't do?
|
| (I have not finished the article, but the idea that devs / data
| scientists can be replaced by some vendors makes me wonder what
| I have missed)
|
| Edit: Also love the Uranium quote :-)
| zippy5 wrote:
| So my assumption is that for a given business model, like
| e-commerce or a SaaS business, much of the highest value
| analysis is fairly standardized and can be templated. For
| example, breaking down conversion rate by weekly cohort is
| something that can pretty easily be done in Google
| Analytics.
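|
| For a sense of how templated that analysis is, here is a small
| sketch of conversion rate by weekly signup cohort. The event
| tuples and the function name are made up for illustration, and
| cohorts are keyed by ISO year and week:
|
```python
from collections import defaultdict
from datetime import date

# Hypothetical signup events: (user_id, signup_date, converted_to_paid)
signups = [
    (1, date(2021, 7, 5), True),
    (2, date(2021, 7, 6), False),
    (3, date(2021, 7, 8), False),
    (4, date(2021, 7, 9), False),
    (5, date(2021, 7, 12), False),
    (6, date(2021, 7, 14), True),
]


def conversion_by_weekly_cohort(events):
    """Return {(iso_year, iso_week): conversion_rate} for signup cohorts."""
    totals = defaultdict(int)
    conversions = defaultdict(int)
    for _user_id, signup_date, converted in events:
        iso = signup_date.isocalendar()
        cohort = (iso[0], iso[1])  # (ISO year, ISO week number)
        totals[cohort] += 1
        conversions[cohort] += int(converted)
    return {c: conversions[c] / totals[c] for c in sorted(totals)}


cohort_rates = conversion_by_weekly_cohort(signups)
for (year, week), rate in cohort_rates.items():
    print(f"{year}-W{week:02d}: {rate:.0%}")
```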
|
| The problem with English-to-SQL translators, or most coders in
| general, is the assumptions we make, in particular about the
| underlying data. For example, say we want to join two tables,
| so we write a query to join on two columns and often call it
| correct, which it is from a logical or schema perspective.
| However, null values, defaults like 0, many-to-one
| relationships vs one-to-one relationships, issues with
| instrumentation such as networking timeouts or bot detection,
| etc. can all impact the downstream metrics. My point is that
| when there are 500 lines of SQL in a query, such as those
| mentioned in the article, there are a lot of ways to be mostly
| correct but cumulatively wrong.
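|
| A small sketch of two of those traps, assuming SQLite and
| hypothetical `orders` and `sessions` tables. The join is valid
| SQL, yet a many-to-one key double counts one order and a NULL key
| silently drops another:
|
```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, user_id INTEGER, amount REAL);
    CREATE TABLE sessions (session_id INTEGER, user_id INTEGER);
    -- Order 3 has no user_id (e.g. a guest checkout).
    INSERT INTO orders VALUES (1, 10, 100.0), (2, 20, 50.0), (3, NULL, 25.0);
    -- User 10 has two sessions, so the join key is not unique here.
    INSERT INTO sessions VALUES (1, 10), (2, 10), (3, 20);
""")

# Ground truth: total revenue straight from the orders table.
true_revenue = conn.execute("SELECT SUM(amount) FROM orders").fetchone()[0]

# Schema-correct join: order 1 is counted twice, order 3 disappears.
joined_revenue = conn.execute("""
    SELECT SUM(o.amount)
    FROM orders o
    JOIN sessions s ON s.user_id = o.user_id
""").fetchone()[0]

# Cheap checks to run before trusting the join.
dup_keys = conn.execute("""
    SELECT user_id, COUNT(*) FROM sessions
    GROUP BY user_id HAVING COUNT(*) > 1
    ORDER BY user_id
""").fetchall()
null_keys = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE user_id IS NULL"
).fetchone()[0]

print(true_revenue, joined_revenue, dup_keys, null_keys)
```
|
| Neither check is specific to this schema: counting duplicate and
| NULL join keys before joining is a generic way to catch this
| class of error.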
|
| Like many popular enough open source tools, 3rd party vendors
| get battle tested, issues get found before you hit them, and
| they can justify devoting more resources to rigorously ensuring
| correctness than the average analyst has the time or energy
| to do, because their business depends on you trusting the
| outputs.
|
| I'm not saying you couldn't do all this yourself. But given
| the sheer number of analytics tools that are reasonably
| priced, you might have chosen to spend your time on something
| more specialized like a recommendation system.
| lifeisstillgood wrote:
| Can you point me at some of the vendors? I am missing a
| chunk of knowledge, I suspect.
|
| Or is this, for example, people taking Google Analytics
| and producing analysis on top of that?
| somberi wrote:
| +1. @Zippy - May I ask for some of the vendors you refer
| to, please?
|
| Also love the Uranium analogy.
| jiaweihli wrote:
| Highly recommend Heap [1] - they have a neat approach
| that doesn't require you to 'decide' which analytics you
| want to track ahead of time.
|
| Disclaimer: I was an early engineer at Heap.
|
| [1] https://heap.io/
| Dyac wrote:
| Heap might be good but they are crazy expensive. We were
| quoted something like a quarter million dollars. Good
| luck getting that signed off, plus you still need quite
| technical analysts to run the thing.
|
| I've found https://contentsquare.com/ to be much better
| received by juniors and seniors alike, and it's a
| fraction of the cost of heap.
| lifeisstillgood wrote:
| Ah, so these do do web analytics on users - ok. That
| makes much more sense.
| jiaweihli wrote:
| I don't know the specifics of what you were quoted, but a
| quarter million dollars (guessing per year?) does strike
| me as high.
|
| Were you a later-stage startup by chance? The price point
| for pre-Series-C startups should be much, much lower.
| tomrod wrote:
| That's odd. Why would you charge more for a post-series C
| startup or enterprise versus a pre-series C?
| jiaweihli wrote:
| That's generally how pricing works for SAAS products -
| most later stage customers have stricter or more
| customized needs. Think support SLAs, SSO, ACLs for their
| employees, etc.
| grvdrm wrote:
| +2 on that! would love to know about what you think is
| worth investigating @zippy5
| fouc wrote:
| > spends 3M on the team and infrastructure
|
| You're making a pretty big assumption on cost of team &
| infrastructure there. This company could have 100+ people with
| that kind of revenue (I've worked at a company this size
| before). The data team is only about 6 people. The cost of the
| data team & infrastructure is likely less than $1M
| GlennS wrote:
| I liked this article, but I have two questions:
|
| 1. Is it definitely a good idea to build a separate data team,
| rather than embedding people with analytics knowledge in feature
| teams?
|
| Is it possible to do the latter, but still end up with a
| well-curated source-of-truth for your data?
|
| 2. Is A/B testing and driving your business by metrics really a
| good idea?
|
| My (uninformed) impression is that data-driven is responsible for
| rather a lot of rot:
|
| - Extremely irritating websites.
|
| - Businesses ignoring important things because they can't measure
| them. (Financialisation, hand-in-hand with the MBA types the
| author decries.)
| alzaeem wrote:
| I share the frustration with how many A/B testing driven
| development processes end up. Leads to a very iterative process
| with lots of small changes, rather than big bets. Also, trying
| to get statistical significance from iterative changes when you
| don't have a ton of data is problematic.
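|
| To make the sample-size point concrete, here is a sketch using a
| pooled two-proportion z-test (one common choice, assumed here;
| the traffic numbers are invented). The same 5.0% vs 5.5% lift is
| far from significant at 1,000 users per arm and clearly
| significant at 100,000:
|
```python
import math


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for a difference in conversion rates.

    Uses the pooled two-proportion z-test with a normal approximation.
    Assumes both arms have at least one conversion and one non-conversion
    overall, so the standard error is non-zero.
    """
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # For a standard normal, P(|Z| > |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


# Same relative lift (5.0% -> 5.5%), very different amounts of data.
small_p = two_proportion_p_value(50, 1_000, 55, 1_000)
large_p = two_proportion_p_value(5_000, 100_000, 5_500, 100_000)

print(f"1,000 users per arm:   p = {small_p:.3f}")
print(f"100,000 users per arm: p = {large_p:.2e}")
```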
| iamacyborg wrote:
| I think that's just down to a lot of folks who think A/B
| testing is the answer to every problem not necessarily having
| a background in maths or stats. I see it all the time in
| marketing teams where people are so conditioned to think of
| testing as the default that they don't understand what
| they're doing or why.
| dijksterhuis wrote:
| > Is it possible to do the latter, but still end up with a
| well-curated source-of-truth for your data?
|
| It's important to get the core centralised data infrastructure
| up and running (even if it's dirty af) as that helps with the
| bulk of the data work.
|
| The oft quoted not completely true but kinda true statistic is
| that 70% of data work is finding, cleaning and storing the
| data. Analysis and modelling is the easy bit.
|
| You _could_ do it the other way around. Hire some data people
| in each team and get them to meet up every once in a while.
|
| But I'd wager the central data stuff that makes _everyone's_
| life easier will get pushed back behind the "urgent" team work
| every time.
|
| #ConwaysLaw
|
| Edit: it's possible to do both btw. E.g. Have a bunch of
| centralised data engineers that do the heavy lifting stuff.
| With data scientist/analysts embedded in teams doing the fine
| grained modelling stuff. It's not a binary choice (once things
| are up and running).
|
| > My (uninformed) impression is that data-driven is responsible
| for rather a lot of rot.
|
| I agree! I was talking to someone else (not a tech head) the
| other week and realised why they hate tech so much... User
| interfaces that just... Don't work.
|
| Showed him a terminal cli and he went nuts over it.
|
| Then again, we're two kinda weird ye olde "back in my day"
| kinda people... So...
| dgb23 wrote:
| Interesting. I'm a bit of a hybrid, CLI/GUI user. There are
| things that I find easier to do in a CLI (or with text in
| general) and things where a GUI is more natural.
|
| CLIs are finicky and force you to think in terms of text,
| whether it is appropriate or not. GUIs can be more expressive
| and haptic, but are typically very idiosyncratic and can get
| in the way of things.
|
| The data-driven approach to UI seems a bit crazy?
|
| If I think about the problems of any UI, I think in terms of
| communication, intent, learning, psychology and aesthetics.
| All of those things are human to human or human to computer
| related issues.
|
| I think data-driven (as in statistical data derived from user
| behavior) approaches are or can be useful in terms of "what"
| to present, prioritize and so on. But much less so on "how",
| because I think this should be based on experiences derived
| from direct interaction and needs to be induced by
| creativity.
|
| And I mean creativity from both sides, the implementer _and_
| the user. One thing that CLIs generally do better is to
| provide composable tools within an adaptive and simple system
| (pipes, text etc.), whereas it is hard to impossible to let
| GUIs talk to each other and compose them to a user tailored
| whole.
|
| I think we should empower "non-technical" users with the
| freedoms and sound principles we have come to enjoy
| ourselves, instead of letting statistical data dominate their
| experience.
| oliv__ wrote:
| No snark implied but what a great ad for the author!
|
| This was very fun to read, and an interesting window into the
| processes and inner workings of a startup that size.
| neighbour wrote:
| Excellent article. For me, the timing couldn't be better as I am
| about to step into a role not too dissimilar to the one described
| in the piece. It will be interesting to see if I run into many of
| the situations the author describes.
| div3rs3 wrote:
| Done well (like here), storytelling in the style of The Goal is
| both educational and interesting.
| nerdponx wrote:
| This is an incredibly valuable writeup. Great job.
___________________________________________________________________
(page generated 2021-07-09 23:03 UTC)