[HN Gopher] Show HN: Object Detection in an Hour
___________________________________________________________________
Show HN: Object Detection in an Hour
Author : kekeblom
Score : 76 points
Date : 2021-08-07 17:38 UTC (5 hours ago)
(HTM) web link (www.strayrobots.io)
(TXT) w3m dump (www.strayrobots.io)
| kekeblom wrote:
| Hi HN! hietalajulius and I have been working on a toolkit for
| solving computer vision problems.
|
 | These days, there are fancy solutions to many computer vision
 | problems, but good implementations of the algorithms are hard
 | to find: getting to a working solution requires figuring out
 | lots of different steps, the tools are often buggy and poorly
 | maintained, and you typically need a lot of training data to
 | feed the algorithms. Projects easily balloon into months-long
 | R&D efforts, even when run by seasoned computer vision
 | engineers. With the Stray Robots toolkit, we aim to lower the
 | barrier to deploying computer vision solutions.
|
| Currently, the toolkit allows you to build 3D scenes from a
| stream of depth camera images, annotate the scenes using a GUI
| and fit computer vision algorithms to infer the labels from
| single images, among a few other things. In this project, we used
| the toolkit to build a simple electric scooter detector using
| only 25 short video clips of electric scooters.
|
| If you want to try it out, you can install the toolkit by
| following the instructions here:
| https://docs.strayrobots.io/installing/index.html
|
| Going forward we plan to add other components such as 3D keypoint
| detection, semantic segmentation and 6D object pose estimation.
|
| Let us know what you think! Both of us are here to answer any
| questions you may have.
| notum wrote:
 | Looks great! Very innovative approach. Are the generated models
 | compatible with the OpenCV OAK camera?
| tadeegan wrote:
| This is only beneficial in static scenes right? Otherwise you
| can't get free labels across the whole video.
| kekeblom wrote:
 | Yes, it relies on the target being static when capturing the
 | training data, but it's fine for the background to move. We
 | were actually surprised by how well it works on moving objects
 | without being trained on them. In the post, you can see Julius
 | riding a scooter, and that is an unseen example with a detector
 | that was only trained on static scooters.
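
The "free labels" idea discussed above is essentially reprojection: annotate one 3D box in the reconstructed scene, then project its corners through the SLAM camera trajectory into every frame. A minimal numpy sketch under a pinhole-camera assumption (the function and variable names are illustrative, not the toolkit's actual API):

```python
import numpy as np

def project_box(corners_world, T_world_to_cam, K):
    """Project 3D box corners into one image and take the
    axis-aligned 2D box that encloses them.

    corners_world: (8, 3) box corners in world coordinates
    T_world_to_cam: (4, 4) camera pose for this frame (from SLAM)
    K: (3, 3) pinhole camera intrinsics
    """
    # Homogeneous world points -> camera frame
    pts = np.hstack([corners_world, np.ones((8, 1))])  # (8, 4)
    cam = (T_world_to_cam @ pts.T)[:3]                 # (3, 8)
    # Perspective projection to pixel coordinates
    uv = (K @ cam)[:2] / cam[2]                        # (2, 8)
    # 2D bounding box enclosing the projected corners
    x0, y0 = uv.min(axis=1)
    x1, y1 = uv.max(axis=1)
    return x0, y0, x1, y1

# One box, annotated once in 3D, yields a 2D label in every frame
# for which SLAM recovered a camera pose:
corners = np.array([[x, y, z] for x in (0, 1)
                    for y in (0, 1) for z in (4, 5)], float)
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
T = np.eye(4)  # identity pose, for illustration
print(project_box(corners, T, K))
```

Because the box is fixed in the world frame, this only produces correct labels while the target stays static, which is exactly the limitation noted above.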
| tvirosi wrote:
 | Cool article! Those clips are from Sweden, right? :) Just
 | curious if I spotted it correctly.
| kekeblom wrote:
| Close! They are actually from Helsinki, Finland.
| jonatron wrote:
| Using video to automatically build a large training set is smart!
| Well done! I was thinking about making a properly free and open
| dataset from just walking around London, and this gives me some
| ideas...
| drooby wrote:
| Reminds me of: "How would a human do it"
| AndrewKemendo wrote:
| Just so I understand the idealized pipeline here, a user does the
| following:
|
| 1. Use the Scanner app to take the images and camera pose data
|
| 2. Export the scene directory (color and depth images and json
| files) somehow to your computer
|
| 3. Import (integrate, open) the directory via the Stray CLI
|
| 4. Annotate voxels via 3D bounding box in Studio GUI
|
| 5. Generate labels from the annotated voxels
|
| 6. Import data and labels, train and test a detectron model with
| pytorch
|
| 7. Export trained model in torchscript format
|
| 8. Profit
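
As a sketch of step 7, TorchScript export in PyTorch comes down to tracing (or scripting) the trained model and saving it. A tiny stand-in module is used here in place of the trained detectron model, whose actual export detectron2 handles with its own tooling:

```python
import torch
import torch.nn as nn

# Stand-in for the trained detector; purely illustrative. The
# real pipeline would export the trained detectron model instead.
class TinyHead(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(3, 4, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(x)

model = TinyHead().eval()
example = torch.randn(1, 3, 32, 32)

# Trace the model into TorchScript and save it; the .pt file can
# then be loaded from C++ or Python without the model's source.
scripted = torch.jit.trace(model, example)
scripted.save("detector.pt")

reloaded = torch.jit.load("detector.pt")
assert torch.allclose(model(example), reloaded(example))
```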
|
| I assume you require users to "ETL" the scene directory from your
| phone to your desktop/laptop via some manual transfer process?
|
 | Is there any reason I couldn't stop at step 5 and push my newly
 | labeled data to my own training system?
| kekeblom wrote:
| Pretty much yeah. Just to be clear, we only use the color and
| depth images from the camera. There is actually an offline
| calibration step to obtain camera intrinsic parameters, which
| are copied into each scene.
|
| The integrate step runs a SLAM pipeline to compute the
| trajectory of the camera. Then we run an integration step to
| obtain the mesh.
|
 | Our core philosophy is not to stand in the way once you want to
 | do something custom. So if you want to just read the camera
 | poses and 3D labels and do your own thing, you can totally do
 | that; the data is available in each scene folder.
| rocauc wrote:
 | Do you have a sense of how much the tool reduces labeling time
 | in (4) and (5) (compared to labeling with e.g. CVAT), as the
 | post claims?
| nathan_phoenix wrote:
| Just curious, what's the business plan behind this?
| kekeblom wrote:
| We plan to charge for some of the algorithms. Also, eventually
| some parts will run in our cloud and we could charge for
| compute credits.
| posix_compliant wrote:
 | Heads up for anyone else: I was interested in trying the
 | strayscanner app on my iPhone 11, but I get an error when
 | trying to record: "unsupported device: this device doesn't seem
 | to have the required level of ARKit support".
| kekeblom wrote:
 | Ah yeah. The App Store doesn't seem to have a way to restrict
 | downloads to only LiDAR-equipped devices. The description does
 | mention the limitation, but there doesn't seem to be a way to
 | set a hard constraint. Sorry about this! I wonder if there is a
 | way to issue refunds on the App Store.
| kekeblom wrote:
| It seems that only Apple can issue refunds on purchases and
| not developers.
| posix_compliant wrote:
| No worries! It listed my device under the App Store
| compatibility section, so I assumed it was compatible. You
| might be able to modify that section.
___________________________________________________________________
(page generated 2021-08-07 23:00 UTC)