[HN Gopher] YOLOv7: Trainable Bag-of-Freebies
___________________________________________________________________
YOLOv7: Trainable Bag-of-Freebies
Author : groar
Score : 46 points
Date : 2022-07-16 19:27 UTC (3 hours ago)
(HTM) web link (arxiv.org)
(TXT) w3m dump (arxiv.org)
| SrslyJosh wrote:
| > the highest accuracy 56.8% AP among all known real-time object
| detectors with 30 FPS or higher
|
| Yikes. It's not clear to me if that's the upper limit on accuracy
| or a limit imposed by requiring that it run at 30 FPS, but
| still...yikes.
| JustFinishedBSG wrote:
| It's clearly the latter and I don't see why it would be
| "yikes". Real time detectors are useless if "real time" means
| 1fps.
| SrslyJosh wrote:
| What good is speed if the accuracy isn't significantly better
| than a coin flip?
|
| From the paper:
|
| > For example, multi-object tracking [94, 93], autonomous
| driving [40, 18], robotics [35, 58], medical image analysis
| [34, 46], etc.
|
| LOL, these are all great use cases for a model with < 60%
| accuracy!
| IncRnd wrote:
| In YOLOv7, YOLO and v7 don't go well together. No, not at all.
| YOLO normally means "You Only Live Once", and v7 means it's lived
| at least six times before this.
|
| While the author likely didn't have that intention, that's what
| came across.
|
| Even with YOLO meaning "You Only Look Once", YOLO and v7 do not
| go together well.
| gchq-7703 wrote:
| YOLO in this case stands for "You Only Look Once".
| IncRnd wrote:
| Yes.
|
| The point I was making is that YOLO and v7 don't go well
| together, and that is true for either meaning of YOLO.
| Dayshine wrote:
| Huh? It means that the approach is to only process the
| input image frame once, i.e. "look". And this is the 7th
| implementation of that algorithm.
|
| It's not as if this is named "the final algorithm v7".
| isoprophlex wrote:
| Github repo mentions "teaser: Yolov7-mask" showing segmentation
| as well. Highly relevant to my interests. Sadly I can't easily
| discern any other info on this topic.
|
| Does anyone know more, maybe?
| hwers wrote:
| What are you using it for, if you can share? I've thought about
| training some of these and releasing the weights, but I've never
| personally found a reason they'd really be useful, so it never
| happened.
| kylevedder wrote:
| Probably the most interesting trick from the paper is using the
| head as a soft supervisor for earlier layers of the network. The
| intuition is that if the earlier layers learn to imitate the
| higher-capacity later layers, it frees up the capacity of the
| later layers to better learn the residual, and it provides a
| denser supervisory signal.
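|
| A toy sketch of that idea, not the paper's actual implementation:
| the final ("lead") head's confidence, masked by the ground truth,
| becomes a soft target for an auxiliary head attached to an earlier
| layer. All names here (soft_labels_from_lead, total_loss,
| aux_weight) and the 0.25 weight are assumptions for illustration.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def soft_bce(logit, target, eps=1e-12):
    """Binary cross-entropy against a target in [0, 1] (hard or soft)."""
    p = sigmoid(logit)
    return -(target * math.log(p + eps) + (1.0 - target) * math.log(1.0 - p + eps))


def soft_labels_from_lead(lead_logits, ground_truth):
    """Lead head's confidence, zeroed wherever the ground truth is negative."""
    # Assumption: in a real framework these would be treated as constants,
    # so no gradient flows back into the lead head through them.
    return [sigmoid(z) * g for z, g in zip(lead_logits, ground_truth)]


def total_loss(lead_logits, aux_logits, ground_truth, aux_weight=0.25):
    """Lead head learns from hard labels; the aux head imitates the lead head."""
    soft = soft_labels_from_lead(lead_logits, ground_truth)
    n = len(ground_truth)
    lead_loss = sum(soft_bce(z, g) for z, g in zip(lead_logits, ground_truth)) / n
    aux_loss = sum(soft_bce(a, s) for a, s in zip(aux_logits, soft)) / n
    return lead_loss + aux_weight * aux_loss


# Hypothetical per-anchor logits for three candidate boxes.
ground_truth = [1.0, 0.0, 1.0]
lead_logits = [2.0, -2.0, 0.5]
aux_agrees = [2.0, -4.0, 0.5]      # earlier layer mimics the lead head
aux_disagrees = [-2.0, 2.0, -0.5]  # earlier layer contradicts it

loss_agree = total_loss(lead_logits, aux_agrees, ground_truth)
loss_disagree = total_loss(lead_logits, aux_disagrees, ground_truth)
print(loss_agree, loss_disagree)
```

| The auxiliary head that agrees with the lead head gets the lower
| loss, so the extra term pushes the earlier layers toward the later
| layers' predictions on every anchor, not just the labeled ones.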
| squarefoot wrote:
| As someone who only got his feet wet with OpenCV some 20 years
| ago (basic shape recognition, no AI involved), what reading,
| software, etc. would you suggest to catch up and play with
| current technology without being inundated by theory that I'm
| sure I couldn't grasp?
| anewpersonality wrote:
| We should stop calling it YOLO after the creator quit machine
| learning.
| isoprophlex wrote:
| Especially hilarious considering some other people ALSO jumped
| on the "we made an object detector so let's call it YOLOvX"
| wagon and released...
|
| Something called YOLOv7.
|
| https://github.com/jinfagang/yolov7
___________________________________________________________________
(page generated 2022-07-16 23:00 UTC)