[HN Gopher] Samurai: Adapting Segment Anything Model for Zero-Sh...
___________________________________________________________________
Samurai: Adapting Segment Anything Model for Zero-Shot Visual
Tracking
Author : GordonS
Score : 76 points
Date : 2024-11-26 10:16 UTC (4 days ago)
(HTM) web link (yangchris11.github.io)
(TXT) w3m dump (yangchris11.github.io)
| GordonS wrote:
| Full, unabridged title (which adds something important!):
|
| "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual
| Tracking with Motion-Aware Memory"
|
| It's the memory part that I find so impressive in the demo
| videos!
| IshKebab wrote:
| Very impressive. I wish research like this was more deployable.
| It always seems to come in the form of a muddy ball of Python,
| rather than e.g. a C++ or Rust library you could actually deploy
| in a product.
|
| I get why, but it still seems a shame that there's all this cool
| ML research that will only make it into actual products in 10
| years when someone with the resources of Adobe rewrites it in
| something other than Python.
| HanClinto wrote:
| I work on deployed embedded ML products using NVIDIA Jetson,
| and while there are C++ portions, a lot of it (dare I say most
| of it?) is written in Python. It's fast enough for our embedded
| processors, and Docker containers make such things very
| deployable -- even in relatively resource-constrained
| environments. No, we're not on a Raspberry Pi or an Arduino,
| but I don't think that SAM2 is going to squeeze down reasonably
| onto something that size anyways.
|
| If the inference code (TensorRT, TensorFlow, PyTorch, whatever)
| is fast, then what does it matter what the glue code is written
| in?
|
| Python has become the common vulgate as a trade language
| between various disciplines, and I'm all 'bout that.
|
| I've only been working in computer vision for 10-ish years, but
| even when I started, most research projects were in Matlab. The
| fact that universities have shifted away from Matlab and into
| Python is a breath of fresh air, lemme' tell ya'.
| stefan_ wrote:
| > a lot of it (dare I say most of it?) is written in Python
|
| I guess ignorance is bliss once someone has done the work for
| you of getting it all down into TRT.
| Grosvenor wrote:
| TIL the Vulgate was a Latin version of the Bible.
|
| From Apple dictionary:
|
| "the principal Latin version of the Bible, prepared mainly by
| St. Jerome in the late 4th century, and (as revised in 1592)
| adopted as the official text for the Roman Catholic Church."
| zackangelo wrote:
| I've been writing all of our transformer implementations in
| Rust using the Candle crate and it's been great.
|
| While dealing with CUDA and GPUs on servers is never a joy,
| deploying fully contained Rust binaries instead of a morass of
| Python scripts has improved the situation for me significantly.
|
| Getting Samurai running on Candle shouldn't be that large of an
| undertaking. I believe there's already a SAM implementation.
| steinvakt2 wrote:
| Note that this currently only supports single-object tracking. I
| tried it for my research project (tracking cells in microscopy
| videos), but it didn't work well. I guess it's better suited to
| real-world 3D scenarios.
| alberth wrote:
| Seems great for tracking POI on CCTV.
___________________________________________________________________
(page generated 2024-11-30 23:01 UTC)