[HN Gopher] Building an AI that watches rugby
___________________________________________________________________
Building an AI that watches rugby
Author : reddavis
Score : 76 points
Date : 2025-04-17 10:18 UTC (12 hours ago)
(HTM) web link (nickjones.tech)
(TXT) w3m dump (nickjones.tech)
| xnx wrote:
| Better title: "LLM OCR on Rugby screenshots to read score and
| clock"
| djtango wrote:
| I don't quite get how diffing frames allows you to find the
| scores.
|
| TFA mentions comparing a frame with and without - but how do you
| generate that frame without? If you can already do it, what's
| useful about doing that?
| barbegal wrote:
| I think the text is wrong: it's diffing two frames, and the
| areas that are the same are where the scoreboard is, as this
| doesn't change between frames but everything else does.
| nirvael wrote:
| I was also confused by this. I think you're right, but in the
| original text they specifically mention a 'static background'
| that they remove, so it's not just a simple 'wrong way round'
| error, it's a fundamental misunderstanding of what's
| happening. Makes me wonder if the author actually knew what
| they were doing, or was just using an LLM to vibe-code
| everything.
| sebastiennight wrote:
| He's diffing the frames, and then the only pixels that stay the
| same are the UI, from which he doesn't directly get the UI (see
| the example, it's illegible) but he can extract the POSITION of
| the UI on the screen by finding all the non-red pixels.
|
| And then he does a good ol' regular crop on the original image
| to get the UI excerpt to feed the vision model.
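|
| A minimal sketch of the approach described in this comment, not
| the article's actual code. It assumes NumPy and grayscale frames
| held as 2-D uint8 arrays; the function names, the threshold of 8,
| and the synthetic demo frames are all hypothetical. Pixels whose
| value barely changes across frames are treated as the overlay,
| their bounding box gives its position, and the crop is taken from
| an original frame.

```python
import numpy as np


def static_region_bbox(frames, threshold=8):
    """Return (top, left, bottom, right) around pixels that barely change.

    `frames` is a sequence of equally sized 2-D grayscale arrays.
    `threshold` is an assumed tolerance for compression noise.
    """
    stack = np.stack([f.astype(np.int16) for f in frames])
    # Per-pixel spread across frames: a small spread means a static pixel.
    spread = stack.max(axis=0) - stack.min(axis=0)
    static = spread <= threshold
    rows = np.flatnonzero(static.any(axis=1))
    cols = np.flatnonzero(static.any(axis=0))
    if rows.size == 0:
        return None  # nothing static was found
    return int(rows[0]), int(cols[0]), int(rows[-1]) + 1, int(cols[-1]) + 1


def crop(frame, bbox):
    """Cut the bounding box out of an original, undiffed frame."""
    top, left, bottom, right = bbox
    return frame[top:bottom, left:right]


def make_demo_frames(n=6, height=72, width=128):
    """Synthetic footage: a background that shifts every frame, plus a fixed overlay."""
    base = np.add.outer(np.arange(height), np.arange(width))
    frames = []
    for i in range(n):
        frame = ((base + 40 * i) % 256).astype(np.uint8)  # "moving" background
        frame[4:14, 6:46] = 200  # static scoreboard-like rectangle
        frames.append(frame)
    return frames


if __name__ == "__main__":
    frames = make_demo_frames()
    bbox = static_region_bbox(frames)
    print("overlay bounding box:", bbox)
    print("crop shape:", crop(frames[0], bbox).shape)
```

| On real broadcast footage a single bounding box over every static
| pixel is likely too naive (channel logos, letterboxing, and
| momentarily still parts of the pitch are static too), so some
| filtering of the mask would probably be needed before cropping.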
| itissid wrote:
| Why does yolo not work?
| mon_ wrote:
| Why the focus on scorekeeping? I feel like an AI model is
| overkill here, when you have text-based sources readily available
| such as news apps, Twitter feeds, and apps such as Livescore
| which would be easier and cheaper to scrape. They probably cover
| more matches that aren't televised too.
|
| I'd be curious to see what useful insights could be gleaned from
| the match commentary. You have the main commentator giving play-
| by-play objective reporting and then a 'colour' commentator
| giving some subjective analysis during breaks in play. I bet
| there's a lot of interesting ways this could be used.
| dmurray wrote:
| The only interesting part of the model's output was
|
| { "current_play": "ruck", }
|
| So the vision model can correctly identify that there's a ruck
| going on and that the ball is most likely in the ruck.
|
| Why not build on this? Which team is in possession? Who was the
| ball carrier at the start of the ruck, and who tackled him? Who
| joined the ruck, and how quickly did they get there? How
| quickly did the attacking team get the ball back in hand, or
| the defending team turn over possession? What would be a good
| option for the outhalf if he got the ball right now?
|
| All of these except the last would be straightforward enough
| for a human observer with basic rugby knowledge going through
| the footage frame by frame, and I bet it would be really
| valuable to analysts. It seems like computer vision technology
| is at a stage where this could be automated too.
| MuffinFlavored wrote:
| ESPN has play by play stuff for free like this on their
| website for some other sports
|
| not sure if it is done by a human or not
|
| curious how "an AI can do it" yields much difference in terms
| of result for the casual watcher
| brookst wrote:
| If this is the final product, not much difference at all.
|
| But where the human version is pretty much as far as it's
| going to go, this is v0.01 of the AI version. Pretty soon
| the AI will be predicting what will happen next, commenting
| on whether this was a good idea (based on statistics), and
| letting the viewer ask questions about what exactly
| happened and why.
| dmurray wrote:
| > curious how "an AI can do it" yields much difference in
| terms of result for the casual watcher
|
| An AI can do it in volume, and therefore cheaper. I don't
| think a human could do everything I said _in real time_ -
| maybe with a lot of training and custom software.
|
| A human could transcribe the scoreboard, but the article
| still thinks that's an interesting application of cutting-
| edge machine vision.
| thom wrote:
| Humans can do _most_ of what you said in real time, both
| providers using bespoke software and club analysts using
| off the shelf stuff like Sportscode. For full positional
| data on every player, every frame then yes, computer
| vision is doing most of the work but the quality isn't
| always great. Providers with in-stadium multi-camera
| systems provide great data, but you don't necessarily
| have access to the size of dataset you'd want for
| recruitment, and so lower-quality broadcast tracking
| exists (with all the problems you can imagine like
| missing players, occlusions, crazy camerawork etc). Most
| clubs also have wearables for their own analysis. Almost
| every fully automated broadcast tracking solution has hit
| a wall (sometimes on the first day of a season) in terms
| of quality that is often only solved by human QA, or by
| just discarding some games, so this is far from a
| completely solved problem. Fun domain to work in, but
| lots of horrible edge cases.
| thom wrote:
| Multiple companies sell Rugby data of various levels of
| granularity. I don't know if rugby has all the toys (i.e.
| full tracking outside of wearables) that soccer or American
| football have because there's less money sloshing around.
| jollygoodshow wrote:
| Most pros now have the vests, but also they tend to have
| additional tech in their mouth guards. This is mostly for
| CTE monitoring, but I imagine that there's other data that
| can be extracted
| ookdatnog wrote:
| The AI's job as described in this article is two-fold:
|
| - The relatively trivial task of extracting textual data from
| the screen.
|
| - The task of obfuscating that they're publishing other
| people's work as their own.
|
| When I clicked the article I assumed they'd try to
| automatically construct analysis of the game by using AI to
| analyze frames of the game, but that's not what they are doing.
| They are extracting some trivial information from the frames,
| and then they process the audio of the referee mic and
| commentary.
|
| In other words, the analysis has already been done by humans
| and they just want to re-publish this analysis as their own,
| without paying money for it. So they run it through an AI
| because in today's legal environment this seems to completely
| exempt you from copyright infringement or plagiarism laws.
| brookst wrote:
| Perhaps the most surprising thing about the whole LLM
| revolution is how quickly attitudes about IP have shifted in
| the HN and similar communities.
|
| A few years ago, media companies were rent-seeking parasites
| who leveraged the jack-booted thugs of law enforcement to
| protect an artificial monopoly using IP laws that were
| massive overreach and contrary to the interests of humanity.
|
| Today, suddenly, media companies are pillars of society whose
| valuable contributions must be protected from the scourge of
| theft by everything from VC backed AI companies to armchair
| hackers who don't respect the sanctity of IP.
|
| It's amazing how mutable these principles are. I'm sure
| plenty of people are somewhere between the two extremes, but
| the shift is so dramatic that I am 100% sure many individuals
| have completely revised their opinions of IP companies based
| largely on worries about their own work being disrupted.
|
| At the very least it should create some empathy for the
| lawyers and business folk we all despised for their rent-
| seeking blah blah blah. They were just honestly espousing the
| positions their financial incentives aligned them to.
| bondarchuk wrote:
| How do you know you're seeing people's opinions change, and
| not just a change in which people express their opinions?
|
| That said I'd personally be happy if LLMs cause the death
| (or drastic weakening) of copyright and IP laws, however as
| it is now, with no copyright for AIs but the same old
| copyright for humans, it's the worst of both worlds.
| Workaccount2 wrote:
| I know people personally with strong gripes about AI
| "infringement" (in quotes because I believe people are
| just confused about how these models work), and every
| single one of them -100%- have a stash of pirated media
| they casually accumulated over the years.
|
| People are in it for themselves. When you are young
| everyone has righteous ideals, but then trends of society
| eventually ebb, and you realize that just about everyone
| was simply virtue signalling, and few people are
| committed even to their own detriment.
|
| 2005: "End copyright! Trash IP law! Liberate media!"
|
| 2025: "Strengthen Copyright! Extend IP Protection!
| Protect makers!"
| bondarchuk wrote:
| > _I know people personally with strong gripes about AI
| "infringement" (in quotes because I believe people are
| just confused about how these models work), and every
| single one of them -100%- have a stash of pirated media
| they casually accumulated over the years._
|
| I don't know them, of course, but it is a consistent and
| imho reasonable position to be against copyright yet,
| while we normal people live in fear of copyright, ask for
| it to be applied to AI as well.
|
| It is even reasonable IMO to be against copyright for
| individuals but in favour of copyright for businesses.
| That's how it de facto works in a lot of places anyway.
| ookdatnog wrote:
| Not commenting on general trends, but I don't think my
| opinion on IP shifted massively as a result of the rise of
| LLMs. I can summarize it as follows:
|
| - It seems desirable to have _some_ system that allows
| creatives to be paid for their work.
|
| - Whether current IP law is the best system we can come up
| with is highly debatable. But nevertheless it is the system
| we have, and its existence is to some extent justified.
|
| - If we look at the "perfect case" where IP law functions as
| intended (for example, an author publishes a book in which
| they invested years of their life), then breaking IP law
| (sharing that author's work without their consent) in that
| instance seems, to me, immoral.
|
| - Nevertheless there are plenty of excesses in the system
| where I would judge that the application of IP law is
| unjustified and breaking the law is morally justified
| (naturally I still don't recommend it). This includes, for
| example, paywalled papers from publicly-funded research,
| works that can no longer reasonably be purchased (for
| example games for old consoles), most if not all software
| patents, ...
|
| So the question simply boils down to: is sports commentary
| justifiably protected under IP law? I think the answer is a
| pretty clear-cut "yes" here, I don't see how it falls under
| any case of IP law overreach.
| walthamstow wrote:
| I'm not a rugger bugger but every 5 seconds doesn't really seem
| like often enough to be taking screenshots. In soccer anyway, a
| lot can happen in 5 seconds.
| conductr wrote:
| My american football brain had the same reaction. Many of the
| most pivotal plays are replayed in slow motion as commentators
| and spectators debate on what actually happened and if the refs
| got the call right. Also, the average play (i.e. 'down') is 4-5
| seconds, so not nearly enough data to determine what is going
| on.
| dncornholio wrote:
| So Rugby is missing a lot of data besides the scoreline, so they
| created an AI that can extract the scoreline.
| patapong wrote:
| I want AIs that clean my apartment while I watch rugby, not AIs
| that watch rugby while I clean the apartment. ;)
|
| In seriousness, this is a cool project and shows how sophisticated
| an analysis LLMs can do in a plug-and-play manner. They may not
| always be the best solution, but they are a fantastic baseline that
| can be deployed and adapted to a use case in less than an hour.
| petesergeant wrote:
| > We can't hire analysts to watch every match and enter data
| manually.
|
| I'm surprised there aren't enough fans willing to do that if you
| could gamify it.
| securingsincity wrote:
| This is a position in baseball.
| https://www.wbur.org/news/2025/03/30/fenway-park-boston-base...
| Here's a radio piece about the official Fenway Park scorekeeper
| from two weeks ago
| goeiedaggoeie wrote:
| Reading the scoreboard from a TV screen and selling that data is
| restricted in many jurisdictions. This work is pretty naive I
| think.
| brookst wrote:
| Has there ever been a hacker whose top priority is ensuring
| compliance with every regulation in every jurisdiction
| worldwide?
| stronglikedan wrote:
| Good thing they're in only one jurisdiction, not many.
| hash872 wrote:
| I don't think it's possible to be in compliance with every law
| in every jurisdiction simultaneously. There are over 300,000
| federal laws in the US, and apparently no one knows how many
| laws each of the 50 states has. That's 1 of the world's 195
| countries
| sebastiennight wrote:
| I love that as soon as he writes,
|
| > The plan was simple.
|
| You know you're in for a funny read.
|
| More seriously though, the JSON example from a vision language
| model is interesting but does not take into account how much
| extrapolation (hallucination) the model will insert over time.
|
| For instance, even if not visible in the image, your VLM will
| probably start inserting details (such as the color of the team's
| jersey) based on knowing the team's three-letter identifier.
|
| So the reliability of the system will go down over time, and it
| probably compounds if you're using some of that info to feed
| further steps in the loop.
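|
| To put a rough number on the compounding, here is a small worked
| example. It rests on assumptions that are not in the article: each
| step is right with the same probability, steps fail independently,
| and the 99% per-step figure is purely illustrative. The function
| name is hypothetical. Under that model the chance that every step
| so far was correct is the per-step accuracy raised to the number
| of steps.

```python
def chance_all_steps_correct(per_step_accuracy: float, steps: int) -> float:
    """Probability that `steps` independent steps are all correct."""
    if not 0.0 <= per_step_accuracy <= 1.0:
        raise ValueError("per_step_accuracy must be between 0 and 1")
    if steps < 0:
        raise ValueError("steps must be non-negative")
    return per_step_accuracy ** steps


if __name__ == "__main__":
    # 720 steps is one screenshot every five seconds for an hour of footage.
    for steps in (1, 10, 100, 720):
        chance = chance_all_steps_correct(0.99, steps)
        print(f"{steps:4d} steps at 99% each: {chance:.4f}")
```

| Real errors are unlikely to be independent, so this is only a
| back-of-the-envelope model, but it shows why feeding earlier
| outputs into later steps makes small error rates matter.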
| hummuscience wrote:
| The moment I started reading this, I got reminded of this recent
| study: https://arxiv.org/html/2503.10212v1
|
| The scope is a bit different. The study uses an LLM to interpret
| pose estimation data and describe the behavior in each frame. The
| output is text which can be used to create embeddings of
| behavior. As someone who works in ethology, that's a clever (but
| maybe expensive) idea.
|
| I think the author could use something similar. With multi-person
| pose estimation models.
| chrsw wrote:
| Does this mean there's probably AI that's already watching high
| profile football (soccer) matches?
| thom wrote:
| Depends on your definition of AI, but yes, lots of them, and
| not just the high profile matches.
| damnitbuilds wrote:
| TL;DR: It extracts the score from the video and gets text from
| the commentary in the audio.
|
| I was hoping for more.
| disjunct wrote:
| I wrote a similar script that used a TV tuner during the last
| World Cup. Since I had an ATSC source, I was able to just pull
| the CTA-708 captions directly and with little delay.
| numpad0 wrote:
| aw. I thought this would be about an AI cat that makes wrong
| commentaries that you can make pointless arguments against. There
| should be one.
| cewl123 wrote:
| I want AI that does my job while I watch rugby
| rfdearborn wrote:
| > Sending a full-resolution screenshot every five seconds gets
| expensive fast.
|
| For now.
| 4ndrewl wrote:
| My observation is that watching Rugby on TV is not the same as
| watching a Rugby match. You're watching something where choices
| have been made around what you're to see, so your model is
| already restricted in what it can see.
|
| You really need to take a 'full pitch' feed directly from the
| venue, rather than what is broadcast.
___________________________________________________________________
(page generated 2025-04-17 23:01 UTC)