[HN Gopher] Computer Vision: Algorithms and Applications, 2nd ed
___________________________________________________________________
Computer Vision: Algorithms and Applications, 2nd ed
Author : ibobev
Score : 79 points
Date : 2025-09-27 12:27 UTC (3 days ago)
(HTM) web link (szeliski.org)
(TXT) w3m dump (szeliski.org)
| krapht wrote:
| An excellent book for fundamentals. Still haven't found a good
| textbook that covers the next level, that takes you from a
| student to competent practitioner. Advanced knowledge that I've
| picked up in this field has been from coworkers, painfully gained
| experience, and reading Kaggle writeups.
| bonoboTP wrote:
| It gets specialized after that. You need to be more specific
| about the area you are interested in. Computer vision is a very
| broad field. For newer topics, there are often no textbooks yet
| because it takes time to write books and the methods and
| practices change quite fast, so it takes time to stand the test
| of time. Your best bet is arXiv and GitHub to learn the latest
| things.
|
| Object detection / segmentation, human pose (2D/3D), 3D human
| motion tracking and modeling, multi-object tracking, re-
| identification and metric learning, action recognition, OCR,
| handwriting, face and biometrics, open-vocabulary recognition,
| 3D geometry and vision-language-action models, autonomous
| driving, epipolar geometry, triangulation, SLAM, PnP, bundle
| adjustment, structure-from-motion, 3D reconstruction (meshes,
| NeRFs, Gaussian splatting, point clouds), depth/normal/optical
| flow estimation, 3D scene flow, recovering material properties,
| inverse rendering, differentiable rendering, camera
| calibration, sensor fusion, IMUs, LiDAR, bird's-eye-view
| perception. Generative modeling, text-to-image diffusion, video
| generation and editing, question answering, un- and self-
| supervised representation learning (contrastive, masked
| modeling), semi/weak supervision, few-shot and meta-learning,
| domain adaptation, continual learning, active learning,
| synthetic data, test-time augmentation strategies, low-level
| image processing and computational photography, event cameras,
| denoising, deblurring, super-resolution, frame-interpolation,
| dehazing, HDR, color calibration, medical imaging, remote
| sensing, industrial inspection, edge deployment, quantization,
| distillation, pruning, architecture search, auto-ML,
| distributed training, inference systems,
| evaluation/benchmarking, metric design, explainability etc.
|
| You can't put all that into a single generic textbook.
| greenavocado wrote:
| Plus photogrammetric scale recovery, rolling-shutter &
| generic-camera (fisheye, catadioptric) geometry, vanishing-
| point and Manhattan-world estimation, non-rigid / template-
| based SfM, reflectance/illumination modelling (photometric
| stereo, BRDF/BTDF, inverse rendering beyond NeRF),
| polarisation, hyperspectral, fluorescence,
| X-ray/CT/microscopy, active structured-light, ToF waveform
| decoding, coded-aperture lensless imaging, shape-from-
| defocus, transparency & glass segmentation,
| layout/affordance/physics prediction, crowd & group activity,
| hand/eye/gaze performance capture, sign-language, document
| structure & vectorisation charts, font/writer identification,
| 2-D/3-D primitive fitting, robust RANSAC variants,
| photometric corrections (rolling-shutter rectification,
| radial distortion, HDR glare, hot-pixel mapping),
| adversarial/corruption robustness, fairness auditing, on-
| device streaming perception and learned codecs, formal
| verification for safety-critical vision, plus reproducibility
| protocols and statistical methods for benchmarks
| thenobsta wrote:
| It's astounding how much there is to this field.
| lacoolj wrote:
| This is great, but why is it posted here like it's new? This is
| from 2022
| JohnKemeny wrote:
| There's even a HN post from almost exactly 5 years ago:
|
| Computer Vision: Algorithms and Applications, 2nd ed
| (szeliski.org)
|
| 0 comments
|
| https://news.ycombinator.com/item?id=24945823
|
| But anyway; why not? Yes, add (2020) to the title, by all
| means.
| pthreads wrote:
| It is a good thing that links to useful resources like these
| are reposted every now and then. For many, like myself, this
| could be the first time seeing it. Perhaps a date tag would add
| some clarity for those who have already seen it.
| aanet wrote:
| Seen this post on HN so many times..
|
| Would love to see / hear if there are any undergrad/grad-level
| courses that follow this book (or others) that cover computer
| vision - from basic-to-advanced.
|
| Thanks!
| bonoboTP wrote:
| It's right there on the linked website under "Slide sets and
| lectures".
| aanet wrote:
| Thanks
|
| I must be blind
| swader999 wrote:
| This is the right area for you to be in at least.
| dimatura wrote:
| This is a great book - learned a lot from the first edition back
| in the day, and got the second edition as soon as it came out.
| It's always fun to just leaf through a random chapter.
| brcmthrowaway wrote:
| Any updates using AI? One-shot camera calibration?
| krick wrote:
| Genuinely curious: is it even still relevant today? I've got the
| impression that there were a lot of these elaborate techniques
| and algorithms before around 2016, some of which I even learned,
| which subsequently were basically just replaced by some single
| NN-model trained somewhere in Facebook, which you _maybe_ need to
| fine-tune to your specific task. So it's all got boring, and
| learning them today is akin to learning the abacus or finding
| antiderivatives by hand at best.
| EarlKing wrote:
| Those NN-models are monstrosities that eat cycles (and watts).
| If your task fits neatly into one of the algorithms presented
| (such as may be the case in industrial design automation
| settings) then yes, you are most definitely better off using
| them instead of a neural net-based solution.
___________________________________________________________________
(page generated 2025-09-30 23:01 UTC)