2010
Agenda
- General code coverage update
- Moving seldom used functionality to cvaux
- Spreadsheet
- Bug tracking …
- Calibration/Stereo
- Plan of record
- Progress on stability
- Subpixel accurate calibration
- Exposing error estimates
- Expose reprojection errors
- Self-calibration as per Tomas Svoboda
- GSoC
- OpenCV on Android? Can C be wrapped on Android? Doesn’t it need pure Java?
- User poll results here.
- Want more documentation
- GrabCut … expose soft-points/“hints”
- Also: GraphCut-based segmentation with fixation (similar to GrabCut, but uses log-polar transformed space)
- Active Segmentation with Fixation. Ajay Mishra, Yiannis Aloimonos, Cheong Loong Fah, ICCV 2009
- Code (Matlab and Windows at least).
- Texas — no action right now, code lock down until beta ship late this week.
- Parking is desired, but I doubt it got in
- Also want to click on a spot and go there (use the base to get a homography from camera to plane, and therefore a metric and/or optical flow).
- Plug in … still waiting on an arm person … probably Sachin
-
Texture:
- The paper: A Performance Evaluation of Local Descriptors, by Krystian Mikolajczyk and Cordelia Schmid
- indicates that Patch-Duplets work best, closely followed by SIFT. Patch-Duplets is described in (I have the paper): Patch-Duplets for Object Recognition and Pose Estimation, by Bjorn Johansson and Anders Moe
- Here’s my basic recognition idea, intended to scale
- Presuming an interest operator, such as loose grouping by depth (but even on down to sliding window)
- Use many kinds of descriptors
- (a) 2D texture descriptors, each clustered into a KD tree with coarse clustering
- (b) 3D descriptors, cluster and relate to conic surface types
- (c) 2D surface type descriptors, cluster related to surface types and surface geometries
- Use local (what works best)
- (a) grids
- (b) histograms
- (c) pairs
- of such categorical descriptors to identify objects
- We might want the above to generalize to view as well as object
- Once we have a reasonable candidate list of object identities, we use the full model of each object to confirm the object ID and refine the object pose.
- Current Needs:
- Flexible approximate k-means or vocab tree:
- Easy to get children of parents
- Easy to use to index a list of items
- Features:
- Harris over scale
- Duplets
- Straw man SIFT
- Calonder
Minutes
- Seldom used functions moved from cv to cvaux
- All the code (cxcore and cv) uses C++ exceptions.
- ML and HighGUI next, when time permits. Some of cvaux later.
- Calibration stereo
- Improving test coverage
- Improving accuracy
- The current method computes gradients near the corner pixel (the dot product between the radial vector and the gradient should be zero)
- But the average error of this method is ~0.5 pixels
- Working on an improvement based on quadrangles, down to ~0.39 pixels or ~25% better … if that does not work out, we need to move to circles
- But … the texture recognition engine will probably subsume all this
- Give error as per Jean-Yves: gradient (Jacobian) of pixel coordinates with respect to camera parameters, and then solve
- Asked Vijay about multi-camera toolbox, whether to reproduce.
- He’ll discuss with us later, this toolbox is sort of overkill for the robot — it assumes general camera pointing which we don’t have
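The orthogonality condition mentioned in the accuracy notes above (the dot product between the radial vector and the gradient should be zero) can be sketched as a small least-squares problem. This is a minimal illustration on synthetic data, not the OpenCV implementation; `refine_corner` and the sample points are hypothetical and chosen only for the example.

```python
def refine_corner(points, gradients):
    """Find q such that g_i . (q - p_i) = 0 in the least-squares sense.

    This leads to the 2x2 normal equations sum(g g^T) q = sum(g g^T p).
    """
    a11 = a12 = a22 = b1 = b2 = 0.0
    for (px, py), (gx, gy) in zip(points, gradients):
        a11 += gx * gx
        a12 += gx * gy
        a22 += gy * gy
        b1 += gx * gx * px + gx * gy * py
        b2 += gx * gy * px + gy * gy * py
    det = a11 * a22 - a12 * a12
    if abs(det) < 1e-12:
        raise ValueError("gradients do not constrain the corner in both directions")
    # Closed-form inverse of the symmetric 2x2 matrix.
    qx = (a22 * b1 - a12 * b2) / det
    qy = (a11 * b2 - a12 * b1) / det
    return qx, qy


# Synthetic corner at (10.3, 7.6): samples on the vertical edge carry a
# horizontal gradient, samples on the horizontal edge carry a vertical one.
points = [(10.3, y) for y in range(5, 11)] + [(x, 7.6) for x in range(8, 14)]
gradients = [(1.0, 0.0)] * 6 + [(0.0, 1.0)] * 6
corner = refine_corner(points, gradients)
print(corner)
```

On real images the gradients are noisy, which is one plausible source of the residual error discussed above.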
- Bugs
- 4 from Willow
- one major cxcoretest failure … but it’s old and we have failed to reproduce it, so close it for now
-
GSOC probably in this year
- Android phone: the JNA mechanism used by JavaCV, which uses libffi.
- iPhone and Android use gcc, should work.
- Maybe just make proof of concept demos
- C wrappers for C++
- How much of this to do? How important is alternative wrapper support?
- OpenCV on cloud
- Hadoop, but it has too much setup time; good for batch processing
- Already supports Amazon EC2 using Cloudera’s system
- User group poll
- Top thing was documentation
- Need a priority on C++, since we want to move the user base to this much nicer system
- Gary is starting a C++ guide and has to go through this for version 2 of the Learning OpenCV book
- Need James to automate the publishing of the docs online … do it each day
- GrabCut needs soft points exposed
- Want to also have the fixation point paper
-
Texture
- Will do multi-scale Harris Corner detector
- Need SIFT as a comparison … contribute it as a ROS node
- Coming.
- Finishing Calibration work
- Revamping the documentation needs somewhat of a tutorial/intuition
- Texas parking
- Jan 23-30 vacation for Victor
Vadim
The holidays in Russia lasted till the end of the previous week, so there is not much progress.
Here are our results so far:
- Further improved function and condition coverage of camera calibration related functionality. As usual, the updated numbers are available here: http://spreadsheets.google.com/pub?key=tPvCIw1M4fqU9UQCNRTyoLQ&output=html
- Modified the whole cv module to use the new-style (exception-based) OpenCV error handling mechanism (cxcore was already using exceptions). It fixes one serious problem in OpenCV 2.0, broken error handling, and also allowed us to develop badarg tests.
- Moved old C++ classes CvImage & CvMatrix from cxcore to cvaux. Moved cvCalcPGH, cvCalcImageHomography (left from the gesture recognition effort), cvFindDominantPoints and CvConDensation from cv to cvaux. This functionality is rarely used (and not used in ROS at all), so the change should not affect many people. On the other hand, this will help us to achieve the goal stated in the end of 2009 – as good as possible test and documentation coverage of the key modules cxcore, cv and ml.
Victor
- Chessboard detector:
- An algorithmic test for the chessboard finder has been created. It uses the artificial chessboard generator (recently written by Anatoly) to estimate the corner location error with subpixel accuracy.
- Root-mean-squared error for cvFindChessboardCorners is 0.68 pixels (averaged over 10 different chessboards generated with different sizes and different camera/lens parameters).
- After applying cvFindCornerSubpix with a tuned window size, the error goes down to 0.54 pixels.
- A new version of accurate corner location — specific to chessboard corners — has been implemented (a tentative name is find4QuadCornerSubpix).
- It looks for two black and two white quadrangles in the region of each corner and produces a new corner position from relative positions of the quadrangles.
- The function works well for chessboards with relatively large squares — such as 10×10 chessboard filling at least a quarter of a 640×480 frame.
- The function so far does not calculate positions of some of the outer corners as the neighboring white quadrangles can be a part of the outer white border of the chessboard and this requires some additional processing (that I hope to implement tomorrow).
- The RMS error after subsequent application of cvFindCornerSubpix (for outer corners) and find4QuadCornerSubpix goes down to 0.39 pixels (an improvement of more than 25% over 0.54). Much of this error is generated by outer corners.
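The figures in Victor's report can be related with a short sketch. It assumes the RMS error is taken over per-corner Euclidean distances between detected and ground-truth positions (the report does not state the exact definition); `rms_error` and `relative_reduction` are hypothetical helper names.

```python
import math


def rms_error(detected, truth):
    """Root-mean-squared Euclidean distance between matched corner lists."""
    squared = [(dx - tx) ** 2 + (dy - ty) ** 2
               for (dx, dy), (tx, ty) in zip(detected, truth)]
    return math.sqrt(sum(squared) / len(squared))


def relative_reduction(before, after):
    """Fraction by which the error dropped, e.g. 0.25 means 25% lower."""
    return (before - after) / before


# Error values quoted in the report: 0.54 px after cvFindCornerSubpix,
# 0.39 px after additionally applying find4QuadCornerSubpix.
reduction = relative_reduction(0.54, 0.39)
print(round(reduction * 100, 1), "% lower")
```

With the reported numbers the reduction comes out a little under 28%, consistent with the "more than 25%" in the notes.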
Action Items
Gary
- Get Victor new outlet images
- Multi-scale multi-corner
- Send CMU approach to Victor
- Set up meeting on Texas
- CC Victor to Dallas
Vadim
- SIFT into ROS Maybe send to Gary
Victor
- Write Dallas
James
- Automate rebuild and putting documentation online
From last time
Gary
- Edit C++ interface tutorial
- (./) Send out log polar “grabcut” paper
Vadim
Victor
- Send texture and geometry API around
- Subpixel accuracy improvements for calibration.
- Adding calibration error estimates
Agenda
- Plan of record
- High Resolution disparity
- Go over documentation
- Patch duplets do best for pose according to A Performance Evaluation of Local Descriptors, by Krystian Mikolajczyk and Cordelia Schmid
- Patch-Duplets is described in: Patch-Duplets for Object Recognition and Pose Estimation, by Bjorn Johansson and Anders Moe
- Thought is to use this idea but record the texture patches with Scale Invariant Feature Transform with Irregular Orientation Histogram Binning, Yan Cui, Nils Hasler, Thorsten Thormählen, Hans-Peter Seidel
- But done like the Daisy detector
- Daisy detector?
- Fast way of accessing circular regions in images
- Active Segmentation with Fixation. Ajay Mishra, Yiannis Aloimonos, Cheong Loong Fah, ICCV 2009
- Code (Matlab and Windows at least).
- Dense optical flow
- Maybe with Stereo
- Alternative stereo correspondence algorithm
- Graph cut on GPU
- Block matching variants
- Feature based recognition and pose
- Feature + geometry
- Stereo recognition
- 3d model capture
- Plan:
- Victor on stereo this week
- Maria finish up one more calibration test by end of week
- Maria switch to Stereo on Friday
- Duplets, Gary continue, work with Suat to test
- Victor on vacation next week, when back go onto texture based recognition and pose
Minutes
- cv and cxcore coverage are up
- When to relegate to background
- cv
- stereo correspondence
- Someone dedicated to this for awhile
- texture operators
- MSER
- cxcore — to high coverage
- Texture operator
- Extremely random trees or
- Vocabulary trees
- Naive Bayes
- Disparity in High res stereo
- Large number of invalid pixels — not finding disparity
- Kurt suggested dynamic programming for whole image
- Stereo processing by semi-global matching and mutual information Hirschmuller
- Argus: Argus for Russian work, Itseez for international work.
- Staffing up for CUDA and Benchmark work
- One month
- Possible for the high res stereo work to be helped by this
- Maybe start Maria on this task
Vadim
- This week we continued working on tests, mainly for cxcore and the calibration part in cv, and fixing some found bugs.
- We’ve got impressive results on the coverage of cxcore and cv, see the table:
- http://spreadsheets.google.com/pub?key=tPvCIw1M4fqU9UQCNRTyoLQ&output=html
Here are the details:
- Testing:
- Conditional coverage of cvcalibinit.cpp and cvcalibration.cpp has been raised by 8% and 15% respectively (Anatoly)
- Implemented tests for cv::projectPoints & cvProjectPoints2, which are important elements of the camera calibration algorithm (Maria)
- Implemented tests for cv::undistortPoints and cv::initUndistortRectifyMap; raised coverage of cvundistort.cpp to 100/86 (Alexey)
- Rewrote the non-robust cvSolvePoly (the Durand-Kerner algorithm is now used),
- fixed the solvePoly test,
- extended some other cxcore tests, bringing cxcore coverage to 91/73 (Vadim)
- Bug fixing:
- closed tracs #50, #53, #54, #57, #63
- A variant of the calibration.c sample with the artificial chessboard generator has been added to SVN (Anatoly)
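For readers unfamiliar with the Durand-Kerner method mentioned in the cvSolvePoly rewrite above, here is a minimal textbook sketch. It is not the OpenCV code; the function name, the starting guesses (powers of 0.4+0.9j, a common textbook choice) and the stopping rule are assumptions made for illustration.

```python
def durand_kerner(coeffs, iterations=500, tol=1e-12):
    """Approximate all roots of a polynomial at once.

    coeffs lists the coefficients from the highest degree down,
    e.g. [1, -6, 11, -6] for x^3 - 6x^2 + 11x - 6.
    """
    n = len(coeffs) - 1
    lead = coeffs[0]
    monic = [c / lead for c in coeffs]  # normalize so the leading term is 1

    def poly(x):
        # Horner evaluation of the monic polynomial.
        result = 0
        for c in monic:
            result = result * x + c
        return result

    # Distinct complex starting guesses.
    roots = [(0.4 + 0.9j) ** k for k in range(n)]
    for _ in range(iterations):
        max_step = 0.0
        for i in range(n):
            denom = 1
            for j in range(n):
                if j != i:
                    denom *= roots[i] - roots[j]
            step = poly(roots[i]) / denom
            roots[i] -= step
            max_step = max(max_step, abs(step))
        if max_step < tol:
            break
    return roots


roots = durand_kerner([1, -6, 11, -6])
print(sorted(round(r.real, 6) for r in roots))
```

The method updates every root estimate using the current positions of all the others, so it needs no deflation step.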
Victor
- Chessboard detection: a new test for chessboard detection with subpixel accuracy has been implemented.
- A high-resolution (4K x 3K) chessboard image (taken by Radu with a Canon 40D camera) was run through the chessboard detector and then scaled down to 640×480.
- As a result, in the low-resolution image we know the corner coordinates with an accuracy of ~0.15 pixels, which we can compare to the results of the subpixel corner finder.
- The new test revealed several bugs in find4QuadCornerSubpix that were fixed.
- The accuracy of both corner detection methods differs from the corresponding accuracy on artificial chessboards.
- The new function, find4QuadCornerSubpix, shows 45% lower RMS error than findCornerSubpix, 0.43 against 0.78.
- High-resolution stereo: calibrated a stereo pair on chessboard images taken by 2 Canon 40D cameras.
- Calibration shows reasonable results.
- Getting a good disparity image was a challenge and required parameter tuning (and it still has too many invalid pixels).
- Running time for stereo correspondence is 88 seconds per frame.
- Other: the supercomputer group at the local university is applying for a $100K grant from the Russian government for machine learning.
- They wanted to implement several classifiers and make them high-performance.
- We agreed that they will submit the code they create to OpenCV (which means they cannot use any GPL software).
- As a result, they will work (provided they get the money) on implementing and then optimizing/parallelizing a gradient boosting tree classifier and latent SVM.
Action Items
Gary
- Copy in coverage
- Run interest point detectors on narrow stereo
- Set up meeting with Suat and Victor to discuss texture plan
Vadim
- To Stefano
- Dense optical flow
- Maybe with Stereo
- Alternative stereo correspondence algorithm
- Graph cut on GPU
- Block matching variants
Victor
From last time
Gary
- Get Victor new outlet images
- Multi-scale multi-corner
- Send CMU approach to Victor
- Set up meeting on Texas
- CC Victor to Dallas
- Edit C++ interface tutorial
Vadim
- SIFT into ROS Maybe send to Gary
Victor
- Write Dallas
James
- Automate rebuild and putting documentation online
Agenda
Consortium Company meeting
Minutes
- Tools to facilitate or eliminate the movement of data back and forth
- Target some high compute apps
- Look at work that has already been done — lots of image processing
- Some simple noise reduction etc in CUDA SDK
- MPP ~= IPP but only released in binary
- Make feature list
- Priority list
- Have people who have worked on things such as graph cut
- Work on High Res stereo
- May be able to get a high end stereo rig that is 14 bits and 8M pixel (4Kx2K)
- Maybe use: Stereo processing by semi-global matching and mutual information, Hirschmuller
- Add texture based recognition and/or augmented reality
- Others
- Automatic segmentation
- Felzenszwalb’s recognition
Action Items
- Link Joe and Texas
- Wants one
- Use of NVidia card for Stereo display
Agenda
- GSOC, what do we want
- Consortium progress
- Stefano Fabri
- Next OpenCV Release, March 30th.
Ongoing
- Code coverage
- Calibration/Stereo
- Bugs tracking …
List of wanted additions
- Stereo processing by semi-global matching and mutual information, Hirschmuller
- Segmentation: Might get as intern(?)
- Active Segmentation with Fixation. Ajay Mishra, Yiannis Aloimonos, Cheong Loong Fah, ICCV 2009
- Code (Matlab and Windows at least).
- Texture recognition toolbox
- Some similarities with MOPED from CMU, but
- not dependent on SIFT
- cloud capable
- accommodate Stereo up front
- contours
- flexible objects
- New texture features:
- Combine Daisy like idea, or:
- Scale Invariant Feature Transform with Irregular Orientation Histogram Binning, Yan Cui, Nils Hasler, Thorsten Thormählen, Hans-Peter Seidel
- With patch duplets described in: Patch-Duplets for Object Recognition and Pose Estimation, by Bjorn Johansson and Anders Moe
- Tested in A Performance Evaluation of Local Descriptors, by Krystian Mikolajczyk and Cordelia Schmid
- With binarized features as described in Multiple target localization at over 100 fps, BMVC 2009, S. Taylor and T. Drummond.
Minutes
- Coverage of
- cxcore unchanged
- cv 76/65 → 78/66%
- cvaux up by 1%
- calibration coverage 87/80 → 93/83
- Update
- tests, calibration and stereo correspondence work.
- Anatoly: a couple more function tests in the calibration pipeline (perspective transformation, composeRT and others)
- Maria started work on the stereo correspondence test.
- Trying to reproduce the Middlebury framework for stereo correspondence, run all the existing functions (Kolmogorov, BM) through it
- Vadim has read and is ready to implement Hirschmuller’s dense stereo matching
- parameters P1 and P2 weren’t listed (will contact the author)
- compute time and memory are both O(image size * maximal disparity * 8), but the computation is parallelizable
- Huge images need to be processed by tiles
- Stereo is probably more pragmatic at HD, not 8 Mpix, due to lens, aperture and just the volume of data.
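The memory estimate quoted above, O(image size * maximal disparity * 8), can be turned into rough numbers. The sketch below takes the factor 8 directly from the notes; the bytes per stored element, the image size and the disparity range are assumptions picked for illustration, and the function names are hypothetical.

```python
def sgm_cost_elements(width, height, max_disparity, factor=8):
    """Number of stored cost values: image size * max disparity * factor."""
    return width * height * max_disparity * factor


def estimate_megabytes(elements, bytes_per_element=2):
    """Convert an element count to MiB for an assumed element size."""
    return elements * bytes_per_element / float(2 ** 20)


# Assumed example: a 640x480 image with 64 disparity levels.
elements = sgm_cost_elements(640, 480, 64)
print(elements, "elements,", estimate_megabytes(elements), "MiB at 2 bytes each")
```

Because the count grows linearly with both pixel count and disparity range, larger frames quickly reach the point where tiling becomes necessary, as the notes point out.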
-
GSOC
- Image stitching — put together a streetview trail
- Denoising, motion stabilization, lighting balance, image enhancement
- Image collage,
- segmentation
- Segmentation in video/video effects
- Substitutes for SIFT
- 3D model, silhouettes + strip
- March 30th release
- Hirschmuller
- Documentation
- Platform testing: Mac 10.6 (highgui uses old Carbon; 64-bit doesn’t compile)
- OpenMP builds
- Install
- Update python bindings to new interface
- Extend python to MLL. Swig bindings are not stable
- Documentation
- Need regular posting
- Many functions are not searched for in the C++ documentation
- Make code submission examples
- Stefano Fabri
- Fast variant of graphcut for stereo
- Consortium, have resumes in
Vadim Report
This week we practically finished working on calibration tests.
- Test coverage progress:
- cv 76/65 => 78/66
- calibration 87/80 => 93/83
- See the table here: http://spreadsheets.google.com/pub?key=tPvCIw1M4fqU9UQCNRTyoLQ&output=html
Here are the details:
- New tests for reprojectImageTo3D and composeRT (Anatoly)
- A new big test of calibrateCamera, which uses artificially generated chessboards, was started (Anatoly)
- A new big test of the stereo correspondence algorithm was started; it partly reproduces the Middlebury framework (http://vision.middlebury.edu/stereo/) (Maria)
- MSER testing started. The mser sample, contributed long ago by Liu Liu, was cleaned up and added as an OpenCV sample (Alexey)
The work on improving stereo correspondence started.
- Ported x-sobel pre-filtering and speckle post-filtering from stereolib.c to OpenCV (Vadim)
- Studied the H. Hirschmuller paper on semi-global stereo matching, getting ready to implement it (Vadim)
Bug fixing
- Fixed 6+ bugs (tickets #53, #69, #72, #87, #88, #89 + a couple of SF bugs)
New Capability:
- Maria modified the grabcut demo. Now you can set “likely foreground” and “likely background” pixels by pressing the right mouse button with shift/ctrl
- “surely foreground” and “surely background” are set by pressing the left mouse button with shift/ctrl. The functionality is there.
- While “likely foreground” does affect the result, setting “likely background” usually does not, because the number of pixels added to the background model is too small compared to the whole mass of pixels outside the ROI.
- Perhaps, some weights should be used with these extra pixels, but the current API does not support that yet.
Action Items
Gary
- Edit C++ interface tutorial
Vadim
Victor
From last time
Gary
- (./) Copy in coverage
- (./) Run interest point detectors on narrow stereo
- :( Set up meeting with Suat and Victor to discuss texture plan
- :( Edit C++ interface tutorial
Vadim
- To Stefano
- Dense optical flow
- Maybe with Stereo
- Alternative stereo correspondence algorithm
- Graph cut on GPU
- Block matching variants
Victor
Agenda
- Stereo analysis between stereolib in ROS and OpenCV’s stereo algorithms
Minutes
Vadim
My findings are the following:
- differences between OpenCV and stereolib are mainly due to the parameter interpretation:
- OpenCV uses CvStereoBM::textureThreshold as is.
- stereolib transforms it to state->textureThreshold * state->SADWindowSize * state->SADWindowSize * state->preFilterCap / 100;
- OpenCV uses CvStereoBM::uniquenessRatio (denoted uthresh0 below) as a ratio in percent: uthresh0/100
- stereolib basically uses a half of that: at the beginning of the function it transforms the parameter as:
- uthresh = uthresh0 * 0x8000 / 100; i.e. uthresh0 * (1 << 15) / 100
- and then it uses an SSE instruction equivalent to
- uthresh >> 16, which is approximately equal to uthresh0/200
- So if you call the OpenCV function with the scaled textureThreshold and uniquenessRatio, the results become visually very similar.
- Numerically they are a bit different because of the different implementations of subpixel interpolation: OpenCV uses a strict algorithm, while stereolib uses some approximation.
- Still, on Tsukuba only in 12% of pixels the difference exceeds 8 (~half-a-pixel, since the disparity is scaled by 16) and all those pixels are at the object boundaries, where the algorithm is not accurate anyway.
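The parameter relationships described in the findings above can be written out as plain arithmetic. This is a sketch of the stated formulas only, not code from either library; the function names are hypothetical, integer division is assumed to mirror the C expressions, and the sample values (10, 15, 31) are arbitrary.

```python
def stereolib_texture_threshold(texture_threshold, sad_window_size, pre_filter_cap):
    """Value stereolib derives from the user-supplied textureThreshold."""
    return texture_threshold * sad_window_size * sad_window_size * pre_filter_cap // 100


def opencv_effective_uniqueness(uthresh0):
    """OpenCV interprets uniquenessRatio as a percentage."""
    return uthresh0 / 100.0


def stereolib_effective_uniqueness(uthresh0):
    """stereolib scales by 2^15 / 100, then the result is used shifted by 16 bits."""
    uthresh = uthresh0 * 0x8000 // 100
    return uthresh / float(1 << 16)  # approximately uthresh0 / 200


def disparity_in_pixels(fixed_point_disparity):
    """Disparities are scaled by 16, so a difference of 8 is half a pixel."""
    return fixed_point_disparity / 16.0


print(stereolib_texture_threshold(10, 15, 31))
print(opencv_effective_uniqueness(15), stereolib_effective_uniqueness(15))
print(disparity_in_pixels(8))
```

Under these formulas, passing the scaled texture threshold and half the uniqueness ratio to OpenCV is the adjustment that makes the two outputs comparable.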
Regarding the processing speed.
- When OpenCV is built without OpenMP support, stereolib is ~15-20% faster (with speckle filtering turned off, since it’s the same code in both cases). I think, stereolib is faster because it does not handle the image border and because it uses faster subpixel interpolation.
- When OpenCV is built with OpenMP support, it is ~40-50% faster than stereolib (with speckle filtering turned off).
- To enable SSE2 (and OpenMP), you can run CMake with flags USE_SSE2=ON and ENABLE_OPENMP=ON.
Action Items
- Vadim to move on to implementing Stereo processing by semi-global matching and mutual information, Hirschmuller
Agenda
- Outlets
- Features
- Stefano convergence to a plan
-
GSOC
- Put another way, what would you be willing to mentor? Name your topic.
- Switching OpenCV license from BSD to Apache (see also http://en.wikipedia.org/wiki/Apache_License). The reason is protection from lawsuits:
- If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed.
- Also known to those “skilled in the art” as “the nuclear option” — you sue someone over something in OpenCV, you lose your license to OpenCV.
- Issue on how to convert?
- Also, there is a copyright issue with externally donated code — we don’t necessarily own the copyright, just the license to use. This would involve setting up an OpenCV foundation and having anyone who contributes give their copyright to the foundation.
- The reason for this is a bit murky — by giving it under BSD they are allowing its use according to license, but they still own the actual writing.
- I don’t know what it means in practice.
Ongoing
- Code coverage
- Calibration/Stereo
- Plan of record
- {i} When do we call this "closed"?
- Bugs tracking …
List of wanted additions
- /!\ Stereo processing by semi-global matching and mutual information, Hirschmuller
- Segmentation: Might get as intern(?)
- Active Segmentation with Fixation. Ajay Mishra, Yiannis Aloimonos, Cheong Loong Fah, ICCV 2009
- Code (Matlab and Windows at least).
- /!\ Texture recognition toolbox
- Some similarities with MOPED from CMU, but
- not dependent on SIFT
- cloud capable
- accommodate Stereo up front
- contours
- flexible objects
- /!\ New texture features:
- Combine Daisy like idea, or:
- Scale Invariant Feature Transform with Irregular Orientation Histogram Binning, Yan Cui, Nils Hasler, Thorsten Thormählen, Hans-Peter Seidel
- With patch duplets described in: Patch-Duplets for Object Recognition and Pose Estimation, by Bjorn Johansson and Anders Moe
- Tested in A Performance Evaluation of Local Descriptors, by Krystian Mikolajczyk and Cordelia Schmid
- With binarized features as described in Multiple target localization at over 100 fps, BMVC 2009, S. Taylor and T. Drummond.
Minutes
Outlets
- Latest version didn’t build. But today it does compile
- Patrick should contact Victor when the node is ready and Victor will make sure the detector works
- Collect new bag files to be sure
- Outlets will be white 2×1
- The outlet detection is 180 degree rotation invariant, or else it is a bug
- Robot should be able to rotate the plug — Victor thinks there should be distinguishable chessboard on both sides.
- When we tested, as measured in the camera frame: X, Y error < 1 mm, with random short-lived blips to 2 mm that depend on lighting
- Error to ~1mm if projected to the wall plane,
- Z accurate to +/-5mm in the wall plane
- This suggests that you do purely relative ray fly in using readings in the image plane (chessboard location vs hole location)
- Detect contact when near the real Z
- Features
- Dataset generator
- Works by finding camera extrinsics (PnP) to the points, build a box relative to that
- Look for readme file
- Licensing
- Apache 2.0 is good for the anti-lawsuit option
- but it is not compatible with GPL 1.0 and 2.0 (probably), so we might be stuck
- We probably will not change
- Copyrights to code
- When submitting code, turn over the copyright to OpenCV
- But the text is also covered by BSD
- Datafiles can be an exception
- Stefano
- Image stitching
- Variation of SIFT
- Scale Invariant Feature Transform with Irregular Orientation Histogram Binning
- Yan Cui, Nils Hasler, Thorsten Thormählen, Hans-Peter Seidel
- ICIAR 2009
- Want a test framework for measuring feature point performance
- Mikolajczyk / Cordelia Schmid affine invariant descriptors
- Apply affine transform (know ground truth)
- Lowe 3D→2D
- Can do with Gazebo
- Want to build this toolbox
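The test framework idea above (apply an affine transform with known ground truth, then check which feature points are found again) can be sketched as a repeatability score. The score definition, the pixel tolerance and all names here are assumptions for illustration; the notes do not fix a particular metric.

```python
def apply_affine(point, matrix, offset):
    """Map (x, y) through the 2x2 matrix plus translation offset."""
    x, y = point
    return (matrix[0][0] * x + matrix[0][1] * y + offset[0],
            matrix[1][0] * x + matrix[1][1] * y + offset[1])


def repeatability(points_ref, points_warped, matrix, offset, tol=1.5):
    """Fraction of reference points re-detected within tol pixels
    of their ground-truth location in the warped image."""
    if not points_ref:
        return 0.0
    hits = 0
    for p in points_ref:
        ex, ey = apply_affine(p, matrix, offset)
        if any((ex - qx) ** 2 + (ey - qy) ** 2 <= tol * tol
               for qx, qy in points_warped):
            hits += 1
    return hits / float(len(points_ref))


# Hypothetical detections before and after a known 2x scaling plus shift.
reference = [(0, 0), (10, 0), (0, 10)]
warped = [(1, 1), (21, 1.5), (50, 50)]
score = repeatability(reference, warped, [[2, 0], [0, 2]], (1, 1))
print(score)
```

The same loop works for any detector, which is what makes it usable as a common yardstick across feature types.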
-
GSOC
- Need mentors
- Recognition toolbox
- Code coverage went up 1% for calibration … mostly due to the fact that the stereo correspondence tests weren’t finished yet
- Find homography should have tests (C++ function calls the C) … all the geometric stuff
- Documentation auto-generate and put up
- Add more on C++
Priorities going forward
- Outlets for the next 2 weeks
- Textured object recognition
- Stereo is high priority
- Someone allocated to stereo
- Hirschmuller (Vadim thinks this week for draft version)
- Multi-window version improvement … correlation based
- Hierarchical method
- Graphcuts
- Documentation
- Higher priority than tests now (as per user group poll)
- Texas help
- Color blobs need to be more stable.
- 3D tools — easy to compute PnP, rotate, back project, render into a scene
- We have most of these tools, but they are not in a coherent framework
- Testing
- Stereo correspondence
- All new code goes in with tests
- Incremental improvements
Vadim
- Regression test on MSER has been added (Alexey)
- Finished the big camera calibration test (monocular case for now) that uses artificially generated chessboards for random camera intrinsic parameters (Anatoly). Here are our findings:
- camera matrix is estimated reasonably well (the relative error is within ~1-3%) especially when the aspect ratio is fixed and/or the principal point is explicitly centered.
- k1, p1 and p2 distortion coefficients are also estimated rather precisely.
- k2 and k3 estimates are actually terrible. Sometimes they are 2x, 3x … 10x from the actual values, even if the chessboard finder is not invoked but the projected corners are passed directly to the calibration procedure (with very very little noise added). Possible reason is that when most of the board views are somewhere in the image center rather than near the border, those k2 and k3 are multiplied by very small numbers (r^4 and r^6, respectively, where 0<=r< ~0.5-1), so it’s difficult to estimate them precisely.
- translation vectors are estimated pretty accurately.
- rotation vectors are estimated pretty well too when the board is sufficiently close to the camera.
- We can conclude that in order to estimate the distortion parameters precisely we probably need to revise the calibration methodology. The calibration pattern should cover almost the whole frame and there should be many corners in each view, especially near the image border, where the relative weight of k2 and k3 is larger. Maybe we should try to detect board edges rather than corners, and look for distortion coefficients that make the lines straight.
- Reproduced the Middlebury framework and various quality metrics. Started StereoBM integration and parameter tuning (Maria)
- Fixed automated building of ROS packages dependent on OpenCV in our buildbot (our scripts broke at some point after the directory structure in ROS was modified) (Anatoly)
- OpenCV’s StereoBM implementation was compared in depth with the latest version in ROS. The differences found have been mostly eliminated by adjusting CvStereoBMState parameters (Vadim)
- A draft implementation of the H. Hirschmuller algorithm is ready; started debugging and testing (Vadim)
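The calibration finding above, that k2 and k3 are hard to estimate when the board stays near the image center, follows from how small r^4 and r^6 are there. The sketch below prints those weights for a few radii in normalized coordinates; it uses the usual radial model 1 + k1*r^2 + k2*r^4 + k3*r^6, and the helper name and sample radii are assumptions for illustration.

```python
def radial_weights(r):
    """Multipliers of k1, k2 and k3 in the radial distortion factor."""
    r2 = r * r
    return r2, r2 * r2, r2 * r2 * r2


for r in (0.3, 0.5, 1.0):
    w1, w2, w3 = radial_weights(r)
    print("r=%.1f  k1 weight=%.4f  k2 weight=%.4f  k3 weight=%.6f" % (r, w1, w2, w3))
```

At r = 0.3 the k3 term is weighted by less than 0.001, so even small corner noise can swamp it, while near r = 1 all three terms carry comparable weight.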
Code coverage. calibration dynamic (aka conditions) coverage was improved by 1% this week (from 83 to 84), otherwise, there are no other changes:
http://spreadsheets.google.com/pub?key=tPvCIw1M4fqU9UQCNRTyoLQ&output=html
Misc: Alexey Latyshev is on vacation for the whole of this week (Feb 1-7)
Action Items
Gary
- Edit C++ interface tutorial
- (./) Tell Patrick to tell Victor when he’s done modifying the outlet package
- Generate lots of new bag files in different conditions
- Oriented in each direction
- Think of what you want to mentor for GSOC
- (./) Have Suat send code to Victor
- Get auto-generated docs put up to wiki every night
Vadim
- Find out what Stefano means in terms of mentoring
- Probably work in terms of image stitching
- Maybe student can build ground truth test set on natural scenes
- Think of what you want to mentor for GSOC
Victor
- Think of what you want to mentor for GSOC
From last time
Gary
- :( Edit C++ interface tutorial
Vadim
- (./) To Stefano
Victor
Agenda
- Hirschmuller/stereo
- Outlets
- Fast grid detector
- Recognition toolbox
- Need a new dataset for tracking
-
GSOC
- Meet with Jean-Yves about stitching
- Kurt(?)
Ongoing
- Code coverage
- Calibration/Stereo
- Bugs tracking …
List of wanted additions
- Stereo processing by semi-global matching and mutual information, Hirschmuller
- Segmentation: Might get as intern(?)
- Active Segmentation with Fixation. Ajay Mishra, Yiannis Aloimonos, Cheong Loong Fah, ICCV 2009
- Code (Matlab and Windows at least).
- Texture recognition toolbox
- Some similarities with MOPED from CMU, but
- not dependent on SIFT
- cloud capable
- accommodate Stereo up front
- contours
- flexible objects
- New texture features:
- Combine Daisy like idea, or:
- Scale Invariant Feature Transform with Irregular Orientation Histogram Binning, Yan Cui, Nils Hasler, Thorsten Thormählen, Hans-Peter Seidel
- With patch duplets described in: Patch-Duplets for Object Recognition and Pose Estimation, by Bjorn Johansson and Anders Moe
- Tested in A Performance Evaluation of Local Descriptors, by Krystian Mikolajczyk and Cordelia Schmid
- With binarized features as described in Multiple target localization at over 100 fps, BMVC 2009, S. Taylor and T. Drummond.
Minutes
- Hirschmuller doesn’t work as well in low texture
- Consumes a lot of memory (hundreds of megabytes), questionable even at HD resolution
- Need to process tiles, but then lose the semi-global property and so lose context in low texture scenes
- Use Kurt’s differences between box, will speed up algorithm
- Will work better in low texture and large resolution images
- Want to try with Robot’s textured stereo at 640×480
Outlets
- New bag files give about 100% detection
- Companding (bit compression) works better
- Pose priors
- Unstable: angular degrees of freedom hard to estimate except roll (rotation around Z)
- Want to use hints to restrict the angular pitch and yaw in the pose estimation
- Represented by rotation vector … not roll, pitch, yaw
- Angle axis harder to restrict
- Find outlet, run FindExtrinsicParams; rays will go through outlet holes. Can use ray piercing
- If you know the outlet size, then piercing can give you depth by scaling
- Latest code that works with new bags is checked in
- Parameters
- Texture recognition toolbox
- Document, so far internal https://docs.google.com/a/willowgarage.com/Doc?docid=0AW1WrefIjUo-ZGRzeHZxZmRfNWd2a3A3cWZr&hl=en&invite=CPLmi4cF.
- Scalable
- Extremely random trees
- Vocab trees
- Fast NN
- Want to do
- Datasets — real ground truth tracking
- Rendered objects like Lowe
- Large data set of labeled objects
- Test engine with all current and growing number of features
- Texture object challenge, announce at OpenCV
- Challenges can be challenging … with resources
- Get an intern to do full time during the challenge
- GSOC
- Will talk to J-Y about stitching
- Tweet more
Victor
- Outlet detection: analyzed about 5 bag files created by plugin_sprint team. Outlet holes are saturated for the earlier bag files (2010-02-02), causing noisy detection.
- Also gamma parameter had to be changed for these holes to make one way descriptor work correctly.
- Later bag files (2010-02-08) have much nicer views of an outlet; all holes are very high contrast and can be easily detected (with gamma changed back and increased scale).
- All 6 datasets have been labeled for easier retrieval of detection statistics (Alexey)
- Started object recognition toolbox document to put together all the parts.
- Conducted experiments with Radu’s stereo pair using block matching and graph cuts. Vadim has taken this over as I got dragged back to outlet detection.
- Found a problem with loading outlet.launch due to an exception while reading the outlet template. This is reproduced on exactly one machine; still investigating the root cause.
- Graphics project status: pending contract approval by nVidia
- Benchmark project status: negotiating SOW, tentative start date March 1st.
Vadim
- Hirschmuller algorithm (SGM – semi-global [stereo] matching) has been finished and tested (Vadim)
- For now the mutual information cost function is not implemented; instead, the simpler Birchfield & Tomasi cost function is used.
- On scenes with rich texture it works very well and gives noticeably cleaner and sharper disparity maps compared to the block matching algorithm. The current speed on Tsukuba is 350msec on my # 4GHz Core2Duo laptop (single-thread C code). I believe that with SSE the running time can be reduced down to 80-100msec for the single-threaded version on the same hardware: the core loop operates on 16-bit integers, and there are many min/max operations, for which there are specialized batch SSE instructions. The block matching algorithm runs in ~40msec on the same machine (single-threaded SSE2 code).
- On areas with no texture the algorithm can fail. Moreover, since the algorithm takes a lot of memory (O(W*H*D)), sufficiently large images (like 1024×768 or higher) can only be processed by tiles (i.e. the “semi-global” algorithm uses only the information inside the tile), which should overlap for more robust behaviour. The running time increases because of the overlapped tiles, and yet wrong depth maps can still come out. Using color information helps, but does not completely solve the problem.
- Attached are results of the 2 algorithms on Tsukuba (stereo BM was executed with 7×7 and 17×17 block sizes) and the stereo pair that Radu provided. The original 10mp image was too much for both of the algorithms: StereoBM had too many options to choose from (since the blocks are matched independently), and the tiny tiles in SGM just did not include enough context for robust estimation. After reducing the image by a factor of 4 in each dimension (=> 16x smaller area) the algorithms started to produce reasonable output (with maximum disparity == 128).
- Currently I’m trying to create an intermediate 1-pass algorithm (SGM is a 2-pass algorithm) that will use the same cost function as in Kurt’s algorithm (per-block absolute difference between clipped x-sobel derivatives, with optional color support), but will also use context information from the other blocks on the same line and from the line above (i.e. the first pass from the Hirschmuller algorithm). This hybrid algorithm consumes very little memory (slightly more than the BM algorithm) and at the same time should not be much worse than the original algorithm. I tried disabling the second pass in the original Hirschmuller algorithm, and on Tsukuba it still produces quite decent results.
- Finished stereo test. All the test data from the Middlebury framework is added, and the StereoBM algorithm has been added to the test. (Maria)
- Fixed several build problems and other bugs: tickets #74, #91, #101, #10# (Anatoly)
- Found some more bugs, filed 2 new tickets: #113, #11#
- Code coverage was improved a bit, the new results are available here: http://spreadsheets.google.com/pub?key=tPvCIw1M4fqU9UQCNRTyoLQ&output=html
- Stereo matching results:
- Our raw stereo, 17×17
- Raw color
- Tsukuba 17×17
- Tsukuba 7×7
- Tsukuba color
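To make the memory concern in Vadim's SGM notes above concrete, here is a rough estimate of the size of one dense W×H×D cost volume. It assumes 16-bit costs (the notes say the core loop operates on 16-bit integers) and a maximum disparity of 128 (the value used in the experiments). The function names are illustrative only, and a real implementation needs additional buffers, so this is a lower bound rather than a measurement.

```python
def sgm_cost_volume_bytes(width, height, max_disparity, bytes_per_cost=2):
    """Bytes needed for one dense W x H x D cost volume (assumed 16-bit costs)."""
    return width * height * max_disparity * bytes_per_cost


def to_mib(n_bytes):
    """Convert a byte count to mebibytes."""
    return n_bytes / (1024 * 1024)


if __name__ == "__main__":
    # Resolutions chosen for illustration; 1024x768 is the size quoted in the notes.
    for w, h in [(640, 480), (1024, 768), (1920, 1080)]:
        size = sgm_cost_volume_bytes(w, h, max_disparity=128)
        print(f"{w}x{h}, D=128: {to_mib(size):.0f} MiB")
```

Under these assumptions a 1024×768 pair already needs 192 MiB for a single volume, which is consistent with the "hundreds of megabytes" remark in the minutes and with the need for tiling.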
Action Items
Gary
- Edit C++ interface tutorial
- Get textured light images to Vadim
Vadim
- Get Hirschmuller point clouds to Radu
- Run on textured light
Victor
From last time
Gary
Vadim
- (./) To Stefano
- Dense optical flow
Victor
Agenda
- GSOC
- Mentors
- Background subtraction, feature based tracking
Nicolas Saunier, Ph.D.
Professeur Adjoint / Assistant Professor
Departement des genies civil, geologique et des mines (CGM)
Ecole Polytechnique de Montreal
http://nicolas.saunier.confins.net
- Implement some well known CV algorithms.
Mark Asbach
Institut fur Nachrichtentechnik
RWTH Aachen
D-52056 Aachen
- Image stitching and/or image collage
Gary Bradski
Senior Scientist, Willow Garage
Consulting Prof. Stanford U.
OpenCV Founder, Technical Content Owner
- ?
Vadim Pisarevsky
OpenCV founding team/Czar
- ?
Victor Eruhimov
OpenCV founding team/Senior Researcher
Argus/Itseez founder
- Stereo algorithm progress
- Bugs progress … when to call a code freeze to do bug fix and documentation spruce up?
- Plug in reports
- Texture based object recognition
- BG segmentation
- Have myCreateFGDStatModel(), myUpdateBGStatModel(), and myRleaseBGStatModel() and one data structure MyBGStatModel
- Notably, we are missing myBGStatSegment which just does foreground segmentation, not learning and segmentation.
Ongoing
- Code coverage
- Calibration/Stereo
- Bugs tracking …
List of wanted additions
CODE FREEZE: March 9th
- Get in GSOC application.
- (./) Stereo processing by semi-global matching and mutual information, Hirschmuller
- Segmentation: Might get as intern(?)
- Active Segmentation with Fixation. Ajay Mishra, Yiannis Aloimonos, Cheong Loong Fah, ICCV 2009
- Code (Matlab and Windows at least).
- Texture recognition toolbox
- Some similarities with MOPED from CMU, but
- not dependent on SIFT
- cloud capable
- accommodate Stereo up front
- contours
- flexible objects
- New texture features:
- Combine Daisy like idea, or:
- Scale Invariant Feature Transform with Irregular Orientation Histogram Binning, Yan Cui, Nils Hasler, Thorsten Thormählen, Hans-Peter Seidel
- With patch duplets described in: Patch-Duplets for Object Recognition and Pose Estimation by Bjorn Johansson and Anders Moe
- Tested in A Performance Evaluation of Local Descriptors by Krystian Mikolajczyk and Cordelia Schmid
- With binarized features as described in Multiple target localization at over 100 fps, BMVC 2009, S. Taylor and T. Drummond.
Minutes
Stereo
- Hirschmuller results when mixed with block matching
- Much less memory use — can handle 10Mpix images no problem
- Works better on lower texture
- On Tsukuba, results are good except for boundaries, haven’t tested exact results yet
- Will get exact comparisons when it’s integrated into the test framework
- Faster than original algorithm, much faster than graph cuts, but 10x slower than block matching
- Can probably speed up by factor of 4-6x
- Need to create a test set using Willow’s stereo
- Need to finally get beyond the very limited Middlebury stereo
- Maybe use laser pointer to align 3D scanned object with stereo imagery
- Possibly use the “Borg Scanner” at Stanford
Bugs
- When to call a code freeze?
- March 31st release
- So March 9th will start release process
- Bug fix
- More test code
- Documentation advance
GSoC
- Will have to get clear mentor statements
- Gary will work on application, putting up wiki
- SOW should be clear
- Track by email, twitter
Plugging In
- See Victor’s report below
- Plugging in worked well
- Need to generalize the outlet detection
- Have to add in my gradient grid features
- Has to be stable to scale change
- Geo matching is invariant to scaling
- Has to be invariant to affine transform
- Make learning the outlet quick and general
- Far outlet detection
- Door handle
Textured object recognition
- I’m working on new features for that
- Suat
- Look at his code
- Need
- Recognition pipeline
- Sparse
- Grid
- Start with data
- Specific tasks:
- Sparse textured recognition
- Transparent
- Textureless
- 3D textured data
- Dataset generator
- Test set generator
Background
- Put it into a 3D capture framework with addition of grabcut
- Generalizing the calibration object
- Security
Vadim
- Some fusion of Hirschmuller and Kurt algorithms (called SGBM – semi-global block matching) has been created (Vadim)
- On the one hand, SGBM is similar to Kurt’s algorithm:
- The square blocks centered at each pixel are matched, rather than individual pixels. The sums are computed using sliding windows, so the overall complexity is O(W*H*D), where W is the image width, H is the height, and D is the maximum disparity.
- x-sobel is used to emphasize the texture.
- The whole processing is done with a single pass through the image, so the amount of consumed memory is O(W*D), rather than O(W*H*D) in the original Hirschmuller algorithm. Therefore, no tiling mechanism is needed; the whole image is processed within the same loop.
- The best matches are verified using the uniqueness threshold, i.e., they must be significantly better than the second-best matches.
- After the disparity is computed, the speckles are filtered off using the same procedure as in StereoBM.
- On the other hand, SGBM is similar to Hirschmuller’s algorithm:
- Dynamic programming is used to find the best disparity for each pixel. Unlike the original algorithm, SGBM is a single-pass algorithm, thus it only does the real dynamic programming optimization in each row, and it also tries to minimize the difference between subsequent rows, i.e. it is a greedy algorithm in the vertical and diagonal directions.
- The individual pixels are matched using subpixel-accurate Birchfield metrics, rather than just simple absolute difference.
- The reverse image2→image1 disparity map is computed at the same time as the normal one, and then used to find occlusions.
- I also added support for color images, which improved disparity accuracy on the Tsukuba and Radu stereo pairs.
- The algorithm runs about an order of magnitude slower than StereoBM but much faster than the original algorithm. Also, a significant speedup is possible, since the algorithm is very SIMD-friendly.
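The row-wise dynamic programming mentioned above can be illustrated with the path-wise cost aggregation recurrence from Hirschmuller's semi-global matching paper, applied along a single left-to-right path. This is a minimal sketch on hand-written per-pixel costs, not the SGBM code discussed in these notes: the function names and the penalty values `p1` and `p2` are assumptions chosen for illustration.

```python
def aggregate_row(costs, p1, p2):
    """Aggregate matching costs along one left-to-right path.

    costs[x][d] is the raw matching cost of pixel x at disparity d.
    p1 penalises a disparity change of 1, p2 penalises larger jumps.
    """
    aggregated = [list(costs[0])]
    for x in range(1, len(costs)):
        prev = aggregated[-1]
        prev_min = min(prev)
        row = []
        for d, cost in enumerate(costs[x]):
            candidates = [prev[d], prev_min + p2]
            if d > 0:
                candidates.append(prev[d - 1] + p1)
            if d + 1 < len(prev):
                candidates.append(prev[d + 1] + p1)
            # Subtracting prev_min keeps the values bounded along the path.
            row.append(cost + min(candidates) - prev_min)
        aggregated.append(row)
    return aggregated


def winner_take_all(cost_rows):
    """Pick the disparity with the lowest cost for every pixel."""
    return [min(range(len(row)), key=row.__getitem__) for row in cost_rows]


if __name__ == "__main__":
    # Four pixels, three disparities; the third pixel has a noisy raw cost.
    raw = [[5, 0, 5], [5, 0, 5], [2, 1, 0], [5, 0, 5]]
    print(winner_take_all(raw))                              # raw costs only
    print(winner_take_all(aggregate_row(raw, p1=2, p2=4)))   # with smoothness
```

With raw costs alone the noisy pixel picks a different disparity from its neighbours; after aggregation the smoothness penalties pull it back in line. The full algorithm sums such paths over several directions, which is where the large memory footprint comes from.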
- Some initial experiments with Intel TBB have been done; StereoBM has been rewritten using TBB (Anatoly)
- These are running times on different stereo pairs:
stereo pair | OpenMP (sec) | TBB (sec) |
1 | 2.81219 | 2.80764 |
2 | 4.71362 | 4.85106 |
3 | 4.38655 | 4.13099 |
4 | 2.6288 | 2.68929 |
5 | 4.0809 | 4.53117 |
6 | 1.95581 | 1.80189 |
7 | 3.5965 | 3.56124 |
- As you can see, except for the pair #5 TBB is on-par or even a bit faster than OpenMP.
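The comparison in the table above can be checked with a few lines of arithmetic. The sketch below copies the timings from the table and reports how much slower (positive) or faster (negative) TBB is relative to OpenMP for each pair; the variable and function names are illustrative only.

```python
# (OpenMP seconds, TBB seconds) per stereo pair, copied from the table above.
timings = {
    1: (2.81219, 2.80764),
    2: (4.71362, 4.85106),
    3: (4.38655, 4.13099),
    4: (2.6288, 2.68929),
    5: (4.0809, 4.53117),
    6: (1.95581, 1.80189),
    7: (3.5965, 3.56124),
}


def tbb_overhead_percent(openmp_sec, tbb_sec):
    """Relative TBB running time versus OpenMP, in percent (positive = slower)."""
    return 100.0 * (tbb_sec - openmp_sec) / openmp_sec


slowest_pair = max(timings, key=lambda p: tbb_overhead_percent(*timings[p]))

if __name__ == "__main__":
    for pair, (omp, tbb) in timings.items():
        print(f"pair {pair}: {tbb_overhead_percent(omp, tbb):+.1f}%")
    print("largest TBB overhead on pair", slowest_pair)
```

Pair 5 stands out with roughly 11% overhead, while the other pairs stay within a few percent in either direction.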
Victor
Outlet detection:
- tested on several bags, plugin_sprint needed detection for frontal views and it worked fine after an increase in scale search region,
- 100% accuracy, 0 false alarm rate.
- A method for tracking parameters (for speeding up scale search by narrowing the search region after a successful detection) based on Kalman filter has been implemented (Alexey).
- Several experiments on a possibility of eliminating one way descriptor (that is sensitive to camera parameters and [somewhat] to lighting conditions) have been conducted.
- Matching unclassified keypoints causes several false positives that can be eliminated by additional information (edges, regions).
- Right now a geometric matching method that detects a set of keypoints together with a set of uniform regions has been implemented and is being tested.
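The idea of narrowing the scale search region after a successful detection can be sketched with a scalar Kalman filter. This is not Alexey's implementation: it assumes a constant-scale motion model, and the class name, noise variances, and the 3-sigma search window are all hypothetical choices made for illustration.

```python
import math


class ScalarKalman:
    """One-dimensional Kalman filter with a constant-value motion model."""

    def __init__(self, value, variance, process_var, measurement_var):
        self.value = value                      # current scale estimate
        self.variance = variance                # uncertainty of the estimate
        self.process_var = process_var          # assumed drift between frames
        self.measurement_var = measurement_var  # assumed detector noise

    def predict(self):
        # The scale is assumed constant, so only the uncertainty grows.
        self.variance += self.process_var

    def update(self, measurement):
        gain = self.variance / (self.variance + self.measurement_var)
        self.value += gain * (measurement - self.value)
        self.variance *= 1.0 - gain

    def search_interval(self, n_sigma=3.0):
        """Scale range worth searching in the next frame."""
        half = n_sigma * math.sqrt(self.variance)
        return (self.value - half, self.value + half)


if __name__ == "__main__":
    scale = ScalarKalman(value=1.0, variance=1.0,
                         process_var=0.001, measurement_var=0.01)
    print("initial search range:", scale.search_interval())
    for detected_scale in [1.5, 1.52, 1.49]:
        scale.predict()
        scale.update(detected_scale)
        print("search range:", scale.search_interval())
```

Each successful detection shrinks the variance, so the detector only needs to scan a narrow band of scales around the current estimate.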
Plug detection:
- tested on multiple bag files collected during plugin_sprint,
- fixed a bug and implemented an improvement for low resolution chessboard detector.
- The latest version works for all bags but two where some of the black squares are not detected due to low lighting.
- This can be fixed, but I suggest changing the pattern from a chessboard to a set of circles for more robust detection.
Action Items
Gary
- GSOC application
- Stereo textured images to Vadim
- Find out about making ground truth set. Maybe talk to Morgan at Stanford to get his scanner for making ground truth stereo
- Get Suat’s code to Victor
- Write recognition framework
- (./) Get texas access for Argus
- Scott doesn’t want to invest or maintain the old system while working on the new, so he’d rather they not be used much at Willow until the new ones are created.
Vadim
- See if we can get the improved Background subtraction code into opencv
- Try Hirschmuller on textured light stereo images (try pure Hirschmuller and the modified variant).
- Return point clouds for Radu to look at
Victor
- Look over Suat’s code
From last time
Gary
- :( Edit C++ interface tutorial
- :( Get textured light images to Vadim
Vadim
- :( Get Hirschmuller point clouds to Radu
- :( Run on textured light
Victor
Agenda
- Stereo issues with Kurt
- Calibration GUI tool
- Sparse Pose and Sparse Bundle Adjustment (SPA & SBA): Kurt’s algorithm in
- Object Recognition Stack
- Textured objects
- Scalable recognition
- Geometric confirm.
- Bugs progress … when to call a code freeze to do bug fix and documentation spruce up?
- Plug in, might need minor adjustments, but seem OK for now
- BG segmentation
- Have myCreateFGDStatModel(), myUpdateBGStatModel(), and myRleaseBGStatModel() and one data structure MyBGStatModel
- Notably, we are missing myBGStatSegment which just does foreground segmentation, not learning and segmentation.
Resource Allocation
- ObjectRecognitionStack I’ve created a wiki page: http://pr.willowgarage.com/wiki/ObjectRecognitionStack, This probably heavily involves Victor on the texture recognition and pose toolbox and will pretty much start now. Victor, Me, Suat (starting in April)
- I’m hoping/assuming that the outlet detector stuff is stable, or is rebuilt as a subset of the texture recognition and pose toolbox.
- Kurt wants someone to interact on (aka deliver) ports of his sparse bundle adjustment and sparse pose adjustment algorithms. Yes, we have some sort of bundle adjustment, but Kurt’s stuff probably runs >10x faster. This is the core of visual odometry. Who?
- Stereo — running the textured light 640×480 on pure Hirschmuller, want to see point clouds. Running BM, BM+Hirschmuller and Hirschmuller on the 1Mpixel images from Kurt. Return point cloud. After this, I suppose test code for Hirschmuller and integrate. Vadim
- Probably we want a full calibration GUI that goes on up through stereo and rectification.
- Texture grid — solving problems like doorhandles, far outlets etc. Gary
- Code freeze March 9, OpenCV v# 5 March 31st. [Will need bug cleanup, more docs, more tests. All, but who is main?]
Ongoing
- Code coverage
- Calibration/Stereo
- Bugs tracking …
List of wanted additions
CODE FREEZE: March 9th
- Get in GSOC application, post it http://opencv.willowgarage.com/wiki/GSOC_OpenCV2010.
- (./) Stereo processing by semi-global matching and mutual information, Hirschmuller
- Segmentation: Might get as intern(?)
- Active Segmentation with Fixation. Ajay Mishra, Yiannis Aloimonos, Cheong Loong Fah, ICCV 2009
- Code (Matlab and Windows at least).
- Texture recognition toolbox
- Some similarities with MOPED from CMU, but
- not dependent on SIFT
- cloud capable
- accommodate Stereo up front
- contours
- flexible objects
- New texture features:
- Combine Daisy like idea, or:
- Scale Invariant Feature Transform with Irregular Orientation Histogram Binning, Yan Cui, Nils Hasler, Thorsten Thormählen, Hans-Peter Seidel
- With patch duplets described in: Patch-Duplets for Object Recognition and Pose Estimation by Bjorn Johansson and Anders Moe
- Tested in A Performance Evaluation of Local Descriptors by Krystian Mikolajczyk and Cordelia Schmid
- With binarized features as described in Multiple target localization at over 100 fps, BMVC 2009, S. Taylor and T. Drummond.
Minutes
Stereo
- Hirschmuller is faster
- Seems better on lower texture, you get Real Hirschmuller > modified Hirschmuller > BM
- Need to create the inscribed rectangle where there are only valid disparities between the left and right camera. The statistics for this rectangle can be kept throughout the process.
- At the top, find the lowest non-valid pixel; that bounds the top.
- In Kurt’s code, these are dtop, dbottom,
- Can we put the left-right check from Hirschmuller to BM … but hurts efficiency.
- Kurt will get a bag … want to study L-R algorithm on it
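The left-right (L-R) check discussed above can be sketched on a single image row. This is a generic consistency check, not Kurt's or Vadim's code: it assumes both disparity rows are already available, that a left pixel `x` with disparity `d` matches the right pixel `x - d`, and that `-1` marks an invalid disparity. The function name and the tolerance `max_diff` are illustrative.

```python
INVALID = -1


def left_right_check(disp_left, disp_right, max_diff=1):
    """Invalidate left disparities that the right disparity row does not confirm."""
    checked = []
    for x, d in enumerate(disp_left):
        x_right = x - d
        if d < 0 or x_right < 0 or x_right >= len(disp_right):
            checked.append(INVALID)   # no valid match or match falls outside the image
            continue
        d_right = disp_right[x_right]
        if d_right == INVALID or abs(d_right - d) > max_diff:
            checked.append(INVALID)   # the two views disagree: likely an occlusion
        else:
            checked.append(d)
    return checked


if __name__ == "__main__":
    left = [0, 0, 2, 2, 2, 2]
    right = [2, 2, 2, 2, 0, 0]
    print(left_right_check(left, right))
```

In this example the first two left pixels claim disparity 0, but the right image assigns those positions disparity 2, so they are rejected while the consistent pixels survive.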
SBA
- SBA is the back end bundle adjuster
- Need a front end match “proposer”.
- Patrick has Nister trees
- Need sub-class detector/descriptor
- Even more speedup with more careful dataflow analysis. Semi-Hirschmuller
Detector/Descriptor
- Need to get a design draft of how to put in our detectors/descriptors
- Good sub-matcher
- Then can put into nodelet style as per PCL (Point Cloud Library).
Outlet
- Plane Optimizer
- Victor made this, but can’t get pose from TF due to timestamp (?) problem
- Frame should be published in the bag file, but Victor isn’t seeing it
- Now we’re going to plug in sideways to shorten the outlet cord
Resources
- Maria returns March 2nd, spend on release
- Kurt’s task, fusing with descriptors is within Victor’s
- Optimization person is in play (Alexi S.) … Can Willow get one more
- Sparse Bundle Adjustment delay integration post release?
- Graphics — approved contract today. If so, start March 1st. People are in place to switch over so this is OK
- CPU: SOW is approved. Tentative date is March 1st. Probably delayed to later in March.
- CPU legal put in boilerplate “closed” license. Needs to be changed.
Background Subtraction
- Vadim to look over whether it’s easy to put foreground-only code (without causing the algorithm to continue learning) into the Rainer BG framework.
Vadim
- The semi-Hirschmuller algorithm has been optimized by more than an order of magnitude (!) Computing correspondence for the stereo pair from Radu took # 5s in the previous version; now it takes ~350msec (640×480) on the same laptop. Some further speed improvement is certainly possible, because the algorithm does not do very much extra work compared to the block matching algorithm. I expect that the single-threaded version can probably process the Radu stereo pair in ~100ms or even faster. The BM algorithm processes the stereo pair in ~40ms. (Vadim)
- Performance of the OpenCV implementation of the BM algorithm from stereolib was improved by ~20% and now it is as fast as the original stereolib code (on a single core; on dual- or quad-core machines it’s significantly faster) (Vadim)
- Some first experiments with the new images from Radu & Kurt have been done. The results are surprisingly bad :(, both with BM and the semi-Hirschmuller algorithm. Another surprise is that on the pictures without textured light the results look better! The results are attached. A possible reason is that too severe a rectification was applied, distorting the proportions. The green rectangle is the area where disparity should be valid (at Kurt’s request). (Vadim)
- Integration of TBB continues. Several improvements in TBB-ed StereoBM have been made, so that no repeated memory allocations are invoked during the processing. The gap between OpenMP and TBB is now much smaller, although the OpenMP version still adapts better to the number of CPUs (Anatoly)
- 3 machine learning-related trac tickets (#56(SVM), #64(SVM), #129(Boost)) have been closed (Maria)
Example of new stereo results on Mpix camera (no projected texture):
- Old Block Matching Method very fast, fraction of a second
- New, semi-Hirschmuller results.
- Note that it is better but probably ~0.5 sec
- We should probably think of publishing this — it’s a tradeoff between extremely fast and memory efficient BM and very slow and huge memory hog Hirschmuller (GB per pair?). Semi-Hirschmuller is memory efficient, pretty fast and certainly better than BM in results.
- Misc: Maria is on vacation from last Tuesday till March 2nd.
- Plans: finish TBB integration, try real Hirschmuller on Radu data, start preparations to the code freeze
Victor
Outlet detection:
- Added region-based filtering to the geometrical model of an outlet. A set of points is sampled randomly from the largest uniform region in the training image and then mapped into the test image using the hypothesized affine transform. Deviation of intensities on this point set is used as a filtering criterion. This still does not let us get rid of the one way descriptor, but it tolerates a higher misclassification rate of keypoints with small overhead.
- Pose estimation: A function for estimating the pose of a planar object with a given object normal is implemented. This problem is solved in closed form by projecting rays into the target plane and finding the distance to the plane by minimizing RMS error. The function still has to be tested on real bag files, but I still haven’t succeeded in getting outlet priors working.
- Object recognition: Started to prototype a generic interface for descriptors. The nearest goal is to make several descriptors implemented in OpenCV interchangeable on outlet detection problem.
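The closed-form distance estimate described in the pose estimation item can be illustrated as follows. This is a sketch under stated assumptions, not Victor's function: the plane normal is known in the camera frame, each ray is pierced through the plane `normal · X = 1`, and the distance `d` is the least-squares scale that makes pairwise distances between pierced points match the known distances between model points. All names are hypothetical, and the in-plane rotation is not estimated here.

```python
import math
from itertools import combinations


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def pierce_unit_plane(ray, normal):
    """Point where a ray from the camera centre hits the plane normal . X = 1."""
    s = dot(normal, ray)
    if s <= 0:
        raise ValueError("ray does not hit the plane in front of the camera")
    return tuple(c / s for c in ray)


def plane_distance(rays, model_points, normal):
    """Closed-form least-squares distance from the camera centre to the object plane.

    rays[i] is the viewing direction of model_points[i]; model_points are the
    known coordinates of the same features on the object (in metric units).
    """
    hits = [pierce_unit_plane(r, normal) for r in rays]
    num = den = 0.0
    for i, j in combinations(range(len(rays)), 2):
        a = math.dist(hits[i], hits[j])                   # distance at plane offset 1
        s = math.dist(model_points[i], model_points[j])   # known metric distance
        num += a * s
        den += a * a
    return num / den   # minimises sum over pairs of (d * a - s)^2


if __name__ == "__main__":
    normal = (0.0, 0.0, 1.0)
    # Three features 2 m in front of the camera; rays need not be normalised.
    rays = [(0.0, 0.0, 1.0), (0.05, 0.0, 1.0), (0.0, 0.025, 1.0)]
    model = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.05)]
    print(plane_distance(rays, model, normal))
```

Because pierced points scale linearly with the plane distance, the minimisation reduces to a single ratio, which is also the "depth by scaling" idea from the earlier outlet minutes.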
Action Items
Gary
- GSOC application
- C++ tutorial
- C++ docs still don’t read functions
- Get new plug bag files to Victor
Vadim
- Test L-R constraints on Kurt’s stereo images
- Re-run Hirschmuller, semi-Hirschmuller and BM on textured 640×480 stereo images that have been properly rectified.
- Look over Rainer’s Background Subtraction code to see if it’s easy to allow foreground segmentation only separate from learning. If so, put it in, if not, we’ll delay this functionality.
Victor
- Look over Suat’s code
- Work with Patrick/Radu to architect interest point detector/descriptors
Radu
- Suat’s code to Victor and Patrick.
- Work with Patrick/Victor for deciding how interest point detectors/descriptors should be sub-classed (especially to work easily with PCL).
Patrick
- Work with Victor/Radu to architect interest point detector/descriptors
Kurt
- Send Vadim bag/images to test L-R constraints
James
- Can we get the C++ functions to be searchable in the Wiki … otherwise we’re going to have to do some Python code spelunking ourselves.
From last time
Gary
- :\ GSOC application
- Stereo textured images to Vadim
- :\ Find out about making ground truth set. Maybe talk to Morgan at Stanford to get his scanner for making ground truth stereo
- Get Suat’s code to Victor
- Write recognition framework
- (./) Get texas access for Argus
- {X} Scott doesn’t want to invest or maintain the old system while working on the new, so he’d rather they not be used much at Willow until the new ones are created.
Vadim
- See if we can get the improved Background subtraction code into opencv
- Try Hirschmuller on textured light stereo images (try pure Hirschmuller and the modified variant).
- Return point clouds for Radu to look at
Victor
- Look over Suat’s code
Agenda
- GSOC application due by the 8th …, need to fill in more exactly what people want to mentor.
- March 31st release
- What do we want in/out?
- Report on the stupid C++ wiki lower case function name search bug
- We had a visit with Ajay Mishra, ‘’Active Segmentation with Fixation’’. We (Willow) might try doing a joint project together to be defined, but perhaps pulling items out of clutter.
- Plug in reports
- What goes into OpenCV from this?
- Texture recognition toolbox
- SOW work
- GPU
- CPU
Ongoing
- Code coverage
- Calibration/Stereo
- Bugs tracking …
List of wanted additions
CODE FREEZE: March 9th
- Get in GSOC application.
- (./) Stereo processing by semi-global matching and mutual information, Hirschmuller
- Segmentation: Might get as intern(?)
- Active Segmentation with Fixation. Ajay Mishra, Yiannis Aloimonos, Cheong Loong Fah, ICCV 2009
- Code (Matlab and Windows at least).
- Texture recognition toolbox
- Some similarities with MOPED from CMU, but
- not dependent on SIFT
- cloud capable
- accommodate Stereo up front
- contours
- flexible objects
- New texture features:
- Combine Daisy like idea, or:
- Scale Invariant Feature Transform with Irregular Orientation Histogram Binning, Yan Cui, Nils Hasler, Thorsten Thormählen, Hans-Peter Seidel
- With patch duplets described in: Patch-Duplets for Object Recognition and Pose Estimation by Bjorn Johansson and Anders Moe
- Tested in A Performance Evaluation of Local Descriptors by Krystian Mikolajczyk and Cordelia Schmid
- With binarized features as described in Multiple target localization at over 100 fps, BMVC 2009, S. Taylor and T. Drummond.
Minutes
- (James) Python bindings
- There are too many python bindings (6?) … need to unify. Start over?
- Swig covers more but isn’t as stable
- Current bindings are good, but don’t support C++
- machine learning is not supported
- HOG, Calonder not covered
- Could make a C layer to wrap C++ (which sometimes wraps the C) to make Python easier
- Sample code with current wrappers requires explicit allocation
- It would be nice to get rid of this
- James will look today at the C++ interface
- Numpy array interface looks like it’s working fine
- Can use numpy arrays as if they were cvmat
- cvmat can be misaligned
- Would be good to have numpy as src and dst
- Eigen
- Support can be added
- 2 weeks eigen
- 2 weeks numpy
- Victor strongly for release
- Vadim right after release, can’t make large changes
- Kurt: Region of interest now fixed for stereo (Vadim)
- Need to call this explicitly
- Kurt thinks this should be the default
- Will need to change way stereo is called
- Hirschmuller looks good
- What happens in the L-R check. Doesn’t impose 2x cost
- Not committed yet
- Send message Vadim
Plug in
- New normal bias in estimation in
- Noise is 2x lower ~0.6mm
- Still want to put it in, see if the bias per socket still needed
- Closed form solution
- Minimize reprojection in object
- Kurt: but do you know the covariance in the object plane?
- Might be a matter of just rescaling in Z
- Covariance in image plane needs to be reprojected into socket plane
- Why not use standard numeric optimization
- Worry about the covariance
- Coordinate with Wim and Patrick
- Need to work with the trunk version of vision_opencv
- Might need to branch
Features
- Design of API
- Need to converge design
- Haven’t seen Suat’s code
- Need consistent API
- MLL as example, tried to unify but failed
- handle missing inputs on decision trees
- Scalable
- Approx NN
- extremely random trees
- naive bayes
- vocabulary tree [Patrick did already]
GSOC
- Vadim
- porting OpenCV algorithms from OpenMP to TBB, adding threading to some new algorithms, such as planar object trackers (Calonder, One-way).
- extending HighGUI: more advanced image visualization (pixel value, zoom, loupe, remembering window positions and sizes, “save as” …), separate control panel with buttons, opengl support.
- camera calibration GUI and more advanced calibration algorithms (with more accurate distortion parameter estimation)
- building 3D shape out of silhouettes
- Victor: A project around object recognition
- Gary: Image stitching and/or collage
Release
- Gary talk to Steve
- Vadim estimate timing
- Hirschmuller
- Gary run test code
- Put description in the docs (Vadim)
- Paper for ECCV?
- ECCV
- Normal priors closed form solution
- Stereo
SOW
- Sanjay: seems clear
- All the code belongs to the corporation
- But not a standard
- NNU applied for scalable machine learning grant (Latent SVM will get in)
- VP to sign
- Host sandybridge at Willow
- Andy — new SOW
- GPU … can we make a demo? BM stereo. Then
- Projector, get regular one. Gary ask Kurt about opencv team a projector.
Vadim
- Extra functions cv::getValidDisparityROI and cv::validateDisparity (together with the C counterparts, cvGetValidDisparityROI and cvValidateDisparity) have been added at Kurt’s request. The first function returns the ROI in the computed disparity map that has been computed using only valid pixels of the rectified images; the example pictures were sent on Saturday. The second function performs a left-right consistency check and can be used to reject some incorrectly computed disparities. The algorithm is taken from the Hirschmuller paper and requires the core stereo algorithm to store the pixel/block matching costs (Vadim)
- The original Hirschmuller algorithm (running on blocks, with 1×1 blocks being a special case) has been implemented and tried on the “a chair at WG office” and Tsukuba stereo pairs. The point clouds for the first pair have been sent. The computed disparity looks very close to the approximate Hirschmuller, and yet the performance is 2x lower at least. Since it’s easy to switch between the two variants by setting StereoSGBM::fullDP to true/false, both of the variants are available, but it looks like there is no reason not to prefer the original algorithm, at least on our typical data (Vadim)
- Gaussian mixture-based background/foreground algorithm in cvaux, based on the paper by KaewTraKulPong P and Bowden R, “An Improved Adaptive Background Mixture Model for Real-time Tracking with Shadow Detection”, has been rewritten:
- C++ interface has been provided in addition to the C interface
- the optional learningRate parameter has been added that controls the model update speed. When it is 0, the model is not updated at all; when it is negative, default update speed is used, otherwise the specified value is used.
- performance has been improved a lot. [Vadim]
- The status of TBB integration: the OpenCV build system has been extended to detect TBB on various platforms, but that turned out to be non-trivial. TBB source packages do not include an installation script or a pkg-config script. However, we found that Ubuntu 10.04 ships with pre-packaged TBB # 2 with the pkg-config script. These packages install on Ubuntu 9.10 and work flawlessly. Ubuntu 9.10 itself comes with TBB # 1, which has some bugs critical for us. So we added a TBB detection script to OpenCV that uses environment variables on Windows and Mac, and pkg-config on Linux. In addition, we check that the TBB version is >= # 2 [Anatoly]
- Two contributed patches have been applied: >2gb memory support in highgui from our long-term user Shiqi Yu, and fullscreen highgui windows capability from Yannick Verdie.
- Misc: Maria is back from vacation and will continue testing the stereo algorithms, including the new Hirschmuller.
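The left-right consistency check mentioned for cv::validateDisparity can be illustrated with a minimal sketch. This is not the OpenCV implementation (which, per the note above, works from the stored matching costs); it assumes two independently computed disparity scanlines, and the function name, `max_diff` and the `invalid` marker are hypothetical.

```python
def left_right_check(disp_left, disp_right, max_diff=1, invalid=-1):
    """Invalidate left-scanline disparities that the right scanline does not confirm.

    disp_left[x] = d means left pixel x matches right pixel x - d.
    The match is kept only if the right scanline reports (roughly) the same d there.
    """
    width = len(disp_left)
    checked = list(disp_left)
    for x, d in enumerate(disp_left):
        if d == invalid:
            continue
        xr = x - d  # column of the matching pixel in the right scanline
        if (xr < 0 or xr >= width
                or disp_right[xr] == invalid
                or abs(disp_right[xr] - d) > max_diff):
            checked[x] = invalid
    return checked


left = [2, 1, 1, 2, 4]
right = [1, 1, 1, 5, -1]
print(left_right_check(left, right))
```

Pixels whose match falls outside the image or disagrees by more than `max_diff` are marked invalid; everything else passes through unchanged.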
Victor
- Outlet detection:
- tested on yet another plugin pose (from the side). The detection worked fine after adjusting the scale search space and gamma parameter.
- In the future I plan to search over different gamma parameters and then reduce the search space for both scale and gamma over time with a Kalman filter.
- A new version of planar pose estimation with known plane normal has been implemented. The new algorithm solves the two-dimensional optimization problem (finding the distance to the object plane and rotation angle inside the plane) in closed form, minimizing the reprojection error in the object plane (not the image plane; a closed-form solution would not be applicable there).
- The standard deviation on bag files with non-moving forearm camera facing an outlet is two times smaller than that of the current method.
- The work on the outlet detection test is in progress [Alexey].
Action Items
Gary
- Get GSOC application in
- Get Suat’s code to Victor
- (./) Talk to Steve about release extension
- Stick to the March 31 deadline, call in release # # Then do # # If the incremental difference is not great, then the user adaptation problems won’t be as large either.
- See if we can get a textured light projector to the OpenCV team
- Wiki doc bug on C++ lower case functions.
Vadim
- Send Kurt message when stereo changes are checked in
- Stereo example code to Gary. We want to run: BM, Hirschmuller and “BMH”: Block matching Hirschmuller.
- It should report run time and design some metrics for quality.
- Get data from Middlebury, which has ground truth.
- Willow stereo can at least have number of disparities found for quality. Don’t know what else other than … use of point clouds for recognition?
Victor
- Write up outlet pose for ECCV
James
- Look over how to make numpy, PIL and cvmat all usable in the python framework.
From last time
Gary
- :( GSOC application
- Stereo textured images to Vadim
- Find out about making a ground truth set. Maybe talk to Morgan at Stanford about using his scanner for making ground truth stereo
- Get Suat’s code to Victor
- Write recognition framework
- (./) Get texas access for Argus
- Scott doesn’t want to invest or maintain the old system while working on the new, so he’d rather they not be used much at Willow until the new ones are created.
Vadim
- (./) See if we can get the improved Background subtraction code into opencv
- (./) Try Hirschmuller on textured light stereo images (try pure Hirschmuller and the modified variant).
- Return point clouds for Radu to look at
Victor
- Look over Suat’s code
Agenda
- GSOC application
- March 31st release
- CODE FREEZE
- C++ wiki lower case function name search bug still there
- Bug close out
- Test function focus
- Documentation priorities
- Background subtraction
- C++ Examples
- New stereo
- Python
- MLL usage
- Samples
- Data collection utilities
- Use BG+segmentation
- Boxes relative to a chessboard
- Texture recognition toolbox
SOW work
- GPU
- CPU
Ongoing
- Code coverage
- Calibration/Stereo
- Bugs tracking …
List of wanted additions
CODE FREEZE: March 9th
- (./) Get in GSOC application.
- (./) Stereo processing by semi-global matching and mutual information, Hirschmuller
- Segmentation: Might get as intern(?) —
- Active Segmentation with Fixation. Ajay Mishra, Yiannis Aloimonos, Cheong Loong Fah, ICCV 2009
- Code (Matlab and Windows at least).
- Texture recognition toolbox
- Some similarities with MOPED from CMU, but
- not dependent on SIFT
- cloud capable
- accommodate Stereo up front
- contours
- flexible objects
- New texture features:
- Combine Daisy like idea, or:
- Scale Invariant Feature Transform with Irregular Orientation Histogram Binning, Yan Cui, Nils Hasler, Thorsten Thormählen, Hans-Peter Seidel
- With patch duplets described in: Patch-Duplets for Object Recognition and Pose Estimation, by Bjorn Johansson and Anders Moe
- Tested in: A Performance Evaluation of Local Descriptors, by Krystian Mikolajczyk and Cordelia Schmid
- With binarized features as described in: Multiple target localization at over 100 fps, BMVC 2009, S. Taylor and T. Drummond.
Minutes
GSOC 2010
- Victor
Assessment of OpenCV object recognition toolbox on several different problems:
a) PASCAL 2009
b) Caltech 101
c) The ETH Zurich shape database
During the first phase, sample algorithms will be created to assess OpenCV capabilities in these recognition problems. Input/output engines will be implemented if necessary, one or several recognition algorithms will be implemented and compared to the state of the art results. The convenience of the API will be also assessed. During the second phase, the feedback will be used to improve object recognition toolbox both in terms of algorithms and API. The project will show what someone can do with a recognition problem having just OpenCV, provide useful feedback for the object recognition toolbox and produce a set of recognition samples.
- Vadim
- HighGUI
- Gary
- Image stitching
- Gary will finish the application on Wednesday, April 10
Stereo
- Boundaries for speckle detector: probably this is now fixed and the same as in Kurt’s original code
- These were put in so that processing doesn’t exceed the image boundaries
- Can set ROI image1 and ROI image2 to get maximal image in stereoBM states. If not set, the function will compute all pixels except
- Vadim will change this to be the default
- Now have sample code so that you can do Hirschmuller, BM, semi-Hirschmuller
- This is all running on Middlebury [Maria]
- Now takes into account processing effects on the disparity window, so there will be no border effects even without the speckle filter
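To make the "valid disparity area" discussion concrete, here is a rough sketch of how such a region could be derived from the two rectified-image ROIs, the disparity search range and the matching window size. It is based on geometric reasoning only and is an assumption, not a copy of cv::getValidDisparityROI; the function name and the `(x, y, width, height)` tuple layout are hypothetical.

```python
def valid_disparity_roi(roi1, roi2, min_disparity, num_disparities, window_size):
    """Return (x, y, width, height) of the area where every tested match stays valid.

    roi1 / roi2 are the valid-pixel rectangles of the left / right rectified images.
    A left pixel x is compared with right pixels x - d for d in the search range,
    and the matching window extends window_size // 2 pixels in each direction.
    """
    half = window_size // 2
    max_disparity = min_disparity + num_disparities - 1
    x1, y1, w1, h1 = roi1
    x2, y2, w2, h2 = roi2
    xmin = max(x1, x2 + max_disparity) + half
    xmax = min(x1 + w1, x2 + w2 - min_disparity) - half
    ymin = max(y1, y2) + half
    ymax = min(y1 + h1, y2 + h2) - half
    width, height = xmax - xmin, ymax - ymin
    if width <= 0 or height <= 0:
        return (0, 0, 0, 0)  # nothing can be computed from valid pixels only
    return (xmin, ymin, width, height)


full = (0, 0, 640, 480)
print(valid_disparity_roi(full, full, 0, 64, 15))
```

With a 64-level search and a 15-pixel window on 640×480 images, the left margin is the largest one, because the leftmost columns have no counterpart in the right image at large disparities.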
Features
- Calonder, one-way and Ferns. Calonder dense (PCA version) is a regular vector
- Include the descriptor with matcher class
- IPL has to
- How do Gary’s “patchlettes” fit in?
Plugs and Outlets
- Seems to be working right now on outlet wall and in simulation
- Offset calibrates robot calibration — think this is robot arm dependent
ECCV
- Middlebury
- Data
- Hirschmuller used xsobel
- Sparse bundle adjustment and Sparse pose adjustment — discuss getting this in post release
- Versioning
- Sent James and Brian messages about branch and fixes
- CC me
Code freeze
- C++ wiki search bug on all lower case function
- Work down bug lists
- Add tests for new functions
- Documentation
- Background subtraction
- Data collection utilities
- Available as a sample code
New branches
- cvpatented branch
- user branch
- subdivide these into small modules that can build separately Motion, 3D, …
- Look at Willow’s build bot to kick back “doesn’t build email”
- clean out of unsupported and not building packages
- move to graveyard
Vadim
- OpenCV now fully supports 64-bit Mac OS X. The Carbon UI has been replaced with Cocoa (thanks to Andre Cohen) and the Quicktime backend for video I/O has been replaced with QTKit (thanks to Nicholas Butko). The code is in SVN [Vadim]
- The long-standing problem with old and new CPU support has been mostly solved (the remaining pieces of code are being adjusted now).
- In OpenCV # 0, SSE2, SSE3 etc. support could only be enabled or disabled at compile time, so the code was either slow (without SSE instructions) or crashed on some old hardware.
- Now we call CPUID at startup and determine which instructions the CPU supports. Then the optimal code branch is selected at runtime. Thanks to this runtime dispatching, we can safely use the new Nehalem instructions and even AVX, and do not need to build a special version for the new CPUs.
- The code will be committed after a bit more testing [Vadim]
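The runtime dispatching idea (detect CPU features once, then pick a code branch) can be sketched as follows. This is only an illustration of the pattern in Python, not the OpenCV code: the feature set is passed in by hand instead of being read via CPUID, and `sum_blocked` is a hypothetical stand-in for a vectorized branch.

```python
def sum_scalar(values):
    """Baseline branch that is safe on any CPU."""
    total = 0
    for v in values:
        total += v
    return total


def sum_blocked(values):
    """Stand-in for an optimized branch: handles four values per step."""
    total = 0
    n4 = len(values) - len(values) % 4
    for i in range(0, n4, 4):
        total += values[i] + values[i + 1] + values[i + 2] + values[i + 3]
    for v in values[n4:]:  # leftover tail
        total += v
    return total


def select_sum(cpu_features):
    """Choose the implementation once, from the detected feature set."""
    return sum_blocked if "sse2" in cpu_features else sum_scalar


# In real code the feature set would come from CPUID; here it is hard-coded.
fast_sum = select_sum({"sse2", "sse3"})
print(fast_sum(list(range(10))))
```

Both branches must return the same result, so the caller never needs to know which one was selected.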
- 64-bit Windows support is being improved.
- Our buildbot now includes Win64 build.
- Visual Studio builds the code successfully, except for ffmpeg interface module (opencv-ffmpeg), because of missing ffmpeg binaries.
- MinGW builds the core libraries well (after a few fixes), but highgui currently fails to build. The problem is being investigated. [Alexander]
- Stereo SGBM has been integrated into our test framework [Maria]
- stereo_match.cpp, introduced by Victor, has been extended to run stereo correspondence on already rectified images and to run the modified SGBM or full Hirschmuller SGBM, in addition to block matching. The sample is in SVN [Vadim]
- Misc: Starting from the previous week, Alexander Shishkov, a postgraduate student at Nizhniy Novgorod State Univ., has joined the team in place of Anatoly Baksheev, who will work on improving OpenCV stereo under the NVidia GPU acceleration umbrella project :) Welcome, Alexander, and thank you very much for your work, Anatoly!
Victor
- Outlet detection:
- no reports on outlet misdetections. One report on chessboard misdetection for gazebo (simulated) data, the detector was filtering out a chessboard with narrow white border. The problem was fixed by increasing the white border.
- Outlet detection code is being refactored. The first pass has been completed, the new wrapper interface is used for one way descriptor, a lot of code was pruned out.
- The current version has indicated an instability in the geometric matching engine that is being fixed. The second pass on the interface will be done afterwards.
- A regression ROS test for outlet detection has been implemented. It runs the outlet node on several bag files, stores the poses and compares them with older results (the number of poses and the contents are compared). [Alexey]
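The comparison step of such a regression test (first the number of poses, then their contents) might look roughly like the sketch below. The pose layout (a flat tuple of numbers) and the tolerance are assumptions made for illustration; the actual test code is not shown in these notes.

```python
def poses_match(old_poses, new_poses, tol=1e-3):
    """Return True if both runs produced the same number of poses with
    element-wise equal contents (within tol)."""
    if len(old_poses) != len(new_poses):
        return False
    for old, new in zip(old_poses, new_poses):
        if len(old) != len(new):
            return False
        if any(abs(a - b) > tol for a, b in zip(old, new)):
            return False
    return True


# Hypothetical stored results: (x, y, z) translations from two runs.
reference = [(0.10, 0.02, 0.55), (0.11, 0.02, 0.54)]
current = [(0.1001, 0.02, 0.55), (0.11, 0.0201, 0.54)]
print(poses_match(reference, current))
```

A tolerance is used because poses computed on different machines or builds rarely agree bit for bit.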
Action Items
Gary
- Run my patchlettes detector on the far outlet data
- Define user contrib processes
- Define how to create cvpatented directory
- Help fill out C++ documentation
Vadim
- (./) Set the stereo defaults to return only valid disparity processing areas
- Write up a description of semi-Hirschmuller
- Help fill out C++ documentation
Maria
- Get in Middlebury data
Victor
- Set up GSOC 2010 mailing list
- (./) Send James and Brian, CC Gary, a ping on the patch version of OpenCV
- Send email about dataset labeler … Alexey to check if it still works
From last time
Gary
- (./) Get GSOC application in
- (./) Get Suat’s code to Victor
- (./) Talk to Steve about release extension
- (./) Stick to the March 31 deadline, call in release # # Then do # # If the incremental difference is not great, then the user adaptation problems won’t be as large either.
- :( See if we can get a textured light projector to the OpenCV team
- :( Wiki doc bug on C++ lower case functions.
Vadim
- (./) Send Kurt message when stereo changes are checked in
- (./) Stereo example code to Gary. We want to run: BM, Hirschmuller and “BMH”: Block matching Hirschmuller.
- It should report run time and design some metrics for quality.
- Get data from Middlebury, which has ground truth.
- Willow stereo can at least have number of disparities found for quality. Don’t know what else other than … use of point clouds for recognition?
Victor
- Write up outlet pose for ECCV
James
- Look over how to make numpy, PIL and cvmat all usable in the python framework.
Agenda
- GSOC application: done
- OpenCV GSOC 2010
Fill out your project pages.
- Python
- The Python bindings are improving rapidly, so would it be possible to include a Windows OpenCv # something Beta/Experimental release as part of the online download packages? I expect that many of us work in both Linux and Windows, but lack readily available build capability in Windows, and therefore can’t otherwise test OpenCv python programs on Windows using the improvements from SVN trunk. Furthermore, as the Python bindings had some limitations in the major # 0 release (no samples, etc), to encourage Python use I suggest a formal release of # something and its more usable Python bindings in the not too distant future.
- March 31st release
- Patented code directory with SIFT.
- User contrib
- Need to define process
- need to add directory.
- CODE FREEZE
- C++ wiki lower case function name search bug still there
- Bug close out
- Test function updates
- Documentation progress
- Samples
- Data collection utilities
- Use BG+segmentation
- Boxes relative to a chessboard
- Texture recognition toolbox
SOW work
- GPU
- CPU
Ongoing
- Code coverage
- Calibration/Stereo
- Bugs tracking …
List of wanted additions
CODE FREEZE: March 9th
- (./) Get in GSOC application.
- (./) Stereo processing by semi-global matching and mutual information, Hirschmuller
- Segmentation: Might get as intern(?) —
- Active Segmentation with Fixation. Ajay Mishra, Yiannis Aloimonos, Cheong Loong Fah, ICCV 2009
- Code (Matlab and Windows at least).
- Texture recognition toolbox
- Some similarities with MOPED from CMU, but
- not dependent on SIFT
- cloud capable
- accommodate Stereo up front
- contours
- flexible objects
- New texture features:
- Combine Daisy like idea, or:
- Scale Invariant Feature Transform with Irregular Orientation Histogram Binning, Yan Cui, Nils Hasler, Thorsten Thormählen, Hans-Peter Seidel
- With patch duplets described in: Patch-Duplets for Object Recognition and Pose Estimation, by Bjorn Johansson and Anders Moe
- Tested in: A Performance Evaluation of Local Descriptors, by Krystian Mikolajczyk and Cordelia Schmid
- With binarized features as described in: Multiple target localization at over 100 fps, BMVC 2009, S. Taylor and T. Drummond.
Minutes
GSOC:
- People have pretty much updated their individual projects. We just wait for word from Google.
- Python bindings for windows will be built and released with # # To do this with SVN, we’ll figure out a way to put in pre-built bindings.
- Release
- Lots of tickets being fixed
- Bugs and docs
- # 1 will go out, when people find problems, # 1.1 …
- Code contributions
- Doxygen (?)
- OpenCV development mailing list someone is trying to do this
- Might have to update our test system to work easily with cpp.
- Or use ctest or other
- No global places to put, but it all goes to the same binary.
- So if one test fails, the whole thing crashes
GPU-CPU
- Need to organize memory access.
Vadim
Last week we started preparations to OpenCV # 1 release. Mostly we spent time cleaning bug tracker and fixing various compile problems.
- ticket #155 closed (bug in Haartraining application, reported by R. Lienhart’s student Christian Ries) [Maria]
- closed tickets ## 49, 66, 76, 77, 115, 140, 163, 164, 165, 170, 171, 174, 176 [Vadim]
- stereo test has been fixed to reflect the latest changes in StereoBM (by default, the border is not processed). SGBM (2 variants) and BM have been tested on the standard Middlebury images. As a result, we now get 68th, 69th and 77th places, respectively. Still too low for the paper [Maria]
- quality of SGBM has been slightly improved by taking into account the pixel brightness component in addition to X-Sobel. The images have been sent in a separate mail. [Vadim]
- OpenCV now builds successfully on Win64 (except for the optional opencv-ffmpeg component).
- Looks like OpenCV # 1 will be the first OpenCV version that is tested and pre-packaged for 64-bit Windows [Alexander]
- Samples for the fern-based object detector and Calonder descriptor added to samples/c [Alexey]
Plans for the next week: close more tickets, fix and extend documentation, test installation and source packages on various platforms.
Victor
- Outlet detection:
- Implemented a geometric model that combines points, edges and regions matching. The motivation for this model was to lower the requirements for (or, ultimately, get rid of) keypoints classifier. We already had bits and pieces of this model and the job was to put them all together.
- A hypothesis on affine transformation is used to project template keypoints, edges and regions to a test image. Edges are matched using simple chamfer matching distance. The template image is segmented into several regions with low variation of intensity, and several points are sampled randomly within the largest region.
- These points are projected into the test image and standard deviation of intensities in these points is compared with the corresponding standard deviation in the template image.
- First experiments show that the model is capable of finding outlets in challenging datasets where the current method (based on one way descriptor) had problems (for instance, 2010-02-02-14-35-01.bag).
- The experiments were done without any keypoint classifier, the outlet is found using geometry only. Because the hypotheses are produced by geometric hashing algorithm, the processing time is 20ms per frame (15 for keypoint detection, 5 for matching) which is a huge difference with the current method that does search over scale and needs 1s per frame for robust detection. More experiments are needed to analyze the method robustness on different datasets.
- The one way descriptor method has been improved by implementing a search over several values of gamma parameter. This makes the algorithm work robustly on datasets with different cameras, camera settings and lighting conditions.
- The outlet detection code has been refactored, lots of unused code has been removed. Another pass of refactoring is needed for outlet_model.cpp — implementation of hole extraction.
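The edge-matching part of the geometric model above (project template edges with an affine hypothesis, then score them with a chamfer distance) can be sketched in a few lines. This is a brute-force illustration under assumed conventions: points are `(x, y)` tuples, the affine hypothesis is the hypothetical 6-tuple `(a, b, tx, c, d, ty)`, and the score is the mean nearest-edge distance. A real implementation would normally use a distance transform instead of the inner loop.

```python
import math


def apply_affine(points, affine):
    """Map (x, y) points with x' = a*x + b*y + tx, y' = c*x + d*y + ty."""
    a, b, tx, c, d, ty = affine
    return [(a * x + b * y + tx, c * x + d * y + ty) for x, y in points]


def chamfer_distance(template_edges, image_edges):
    """Mean distance from each projected template edge point to the nearest
    edge point found in the test image (lower is a better match)."""
    if not template_edges or not image_edges:
        raise ValueError("both edge sets must be non-empty")
    total = 0.0
    for tx, ty in template_edges:
        total += min(math.hypot(tx - ix, ty - iy) for ix, iy in image_edges)
    return total / len(template_edges)


template = [(0, 0), (1, 0)]
hypothesis = (1, 0, 2, 0, 1, 0)  # pure translation by (2, 0)
image = [(2, 0), (3, 1)]
print(chamfer_distance(apply_affine(template, hypothesis), image))
```

Each hypothesis produced by the geometric hashing stage would get one such score, and the lowest-scoring hypotheses are the candidates worth checking further.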
- Plug detection:
- another problem with checkerboard reported by Patrick has been fixed. Features detected on the border of the checkerboard caused filtering of the detection. I have implemented a workaround, but in the longer term the low-res checkerboard detection has to be rewritten with a better geometric model (such as geometric hashing).
- Generic interface for descriptors has been prototyped. Outlet detection is already using the new wrapper interface for one way descriptor. Implementation of a wrapper for SURF is on the way.
-
GPU stereo BM implementation studied in detail and several initial optimization attempts were done. Most of them are parameters tuning for a specific card (GT220). Also we experimented with the registers count used by nvcc compiler. The tuned version is about 25% faster on the largest resolution (from 1.91sec to 1.38sec on # 7Mp stereo pair, see attached image).
- Current streaming multiprocessors occupancy is 1.0, global memory reading is coalesced (as in the initial implementation). Profiler shows # 19 GB/s total memory throughput, which is several times lower than theoretical bandwidth limit (10.7 GB/s). Currently we believe that the limiting factor is global memory latency (images fetching, SSD and disparity arrays accessing). Now we need to redesign the algorithm to gain further improvement.
- OpenCV stereo tests adapted for GPU. Now we can measure the accuracy of stereo correspondence for data with given ground truth.
- NVidia NEXUS installed, studied and tested on several sample projects and GPU stereo BM implementation. But this tool is mostly debug-oriented, so it is not important for us at the current stage of the project.
Action Items
Gary
- Define code contrib. process
- Pros and cons of user documentation
Vadim
- Define code test system for user contrib
- Define “other” directories for patented and contrib code
Victor
James
From last time
Gary
- Run my patchlettes detector on the far outlet data
- (./) Update my GSOC project page.
- :( Define user contrib processes
- (./) Get in GSOC application
- Define how to create cvpatented directory
- Help fill out C++ documentation
- Define user contrib. process
Vadim
- (./) Set the stereo defaults to return only valid disparity processing areas
- (./) Write up a description of semi-Hirschmuller
- Help fill out C++ documentation
- (./) Update GSOC project page
Maria
- (./) Get in Middlebury data
Victor
- (./) Set up GSOC 2010 mailing list
- (./) Send James and Brian, CC Gary, a ping on the patch version of OpenCV
- Send email about dataset labeler … Alexey to check if it still works
- (./) Update GSOC project page
Agenda
- Large format stereo
- Release update
- User contribution
- New directory
- CV Patented/cv_nonfree
- Directory that user can download if wanted. Specifically: perhaps just binaries that can be used to test against SIFT?
- Test system
- Code coverage
- Calibration/Stereo
- Bugs tracking …
- User contribution
- GSOC intern updates — go through the list
Next
- Release: March 31st
- Segmentation: Might get as intern(?) —
- Active Segmentation with Fixation. Ajay Mishra, Yiannis Aloimonos, Cheong Loong Fah, ICCV 2009
- Code (Matlab and Windows at least).
- Texture recognition toolbox
- Some similarities with MOPED from CMU, but
- not dependent on SIFT
- cloud capable
- accommodate Stereo up front
- contours
- flexible objects
- New texture features:
- Combine Daisy like idea, or:
- Scale Invariant Feature Transform with Irregular Orientation Histogram Binning, Yan Cui, Nils Hasler, Thorsten Thormählen, Hans-Peter Seidel
- With patch duplets described in: Patch-Duplets for Object Recognition and Pose Estimation, by Bjorn Johansson and Anders Moe
- Tested in: A Performance Evaluation of Local Descriptors, by Krystian Mikolajczyk and Cordelia Schmid
- With binarized features as described in: Multiple target localization at over 100 fps, BMVC 2009, S. Taylor and T. Drummond.
- Stereo
- A good overall reference to stereo algorithms is:
- http://www.vision.deis.unibo.it/smatt/seminar_stereo_vision.html
- Bilateral filter followed by a local smoothing method:
- http://www.vision.deis.unibo.it/smatt/lc_stereo.htm
Minutes
- Large format stereo
- Large format stereo has problems scaling (local minimum in matches)
- NASA and U. of Delaware have experience, should talk to them
- Calibration can be a problem, window size has to flex
- Higher resolution also tends to mean smaller pixels and at that lower scale, you can get less contrast per pixel
- Stereo doesn’t tend to scale.
- What works at 320×240 can break down at 640×480 and up. Probably because:
- Calibration can break down
- Contrast between pixels decreases
NASA
- Really nice cameras
- They have texture (planet surfaces)
- Their shots are pretty much frontal-parallel
- It might be interesting to look at other algorithms that are more local. A good overall reference to stereo algorithms is:
- http://www.vision.deis.unibo.it/smatt/seminar_stereo_vision.html
- I’ve been impressed by the bilateral filter followed by a local smoothing method:
- http://www.vision.deis.unibo.it/smatt/lc_stereo.htm
Plan
- Finish up plans with Hirschmuller
- Careful calibration tests.
- Check if calibration issue — Just take center 640×480. Get the sub-window well calibrated. Will tell us if the calibration is the issue.
- Closed aperture vs open
- Use chessboards with more corners [Vadim]
- Will have to change distortion model [Kurt]
- Local variation in the lenses might have to be accounted for
- This would involve local look-up functions
- Get texture into the scene
- ~HD resolution is higher priority [Gary]
- Gather data:
- Calibration pattern, laser line scan, actual object with and without textured light (can release this data) [Gary]
- Color can help alignment, can we test on color [Vadim]
- Kurt stereo comments from email
Here is my assessment of some problems and the way we should go:
- I don’t expect to be able to extract stereo from scenes to the same degree that we can with human eyes, without adding texture. Partly this is hardware (sensitivity, S/N), and partly software (not using all available cues such as edges, shadows, highlights, global constraints). The recent tests with the 35mm cameras seem to confirm this. Using the SI cameras won’t change this. We’ll still need to project texture.
- We don’t need a lot more pixels on target. HD size (1920×1080) with a wide-angle lens seems like it would be fine for getting good dense depth data.
- Global shutter would be much better than rolling shutter, especially if we’re projecting texture. That’s why I’m not happy with the SI (Altasens) or Sony sensors.
So how should we proceed? Here are my suggestions.
- Get a good megapixel stereo setup with texture projection running. I have such a setup in my office, global shutter with 1.3MP high-sensitivity sensors, ethernet interface. I’ve asked Blaise to hook up the drivers in Linux and a projector board to trigger them, and he’s going to work on it this week. I already got some images to Vadim and he processed them, and they look pretty good. This prototype will help move us forward on algorithms and evaluation of whether we can achieve the 3D results we need.
- Move to IR projection and a 3-head design. Two imagers just look for texture in the IR band with filters, one is color for registered visual texture. The current imager I’m targeting is the CMOSIS CM2000, which has pretty good S/N in HD resolution, and is global shutter CMOS. I have two groups looking at making this device, although it’s not certain either will move forward. Need to press on this.
- Algorithms. If we’re using texture projection, SGBM or something similar but simpler may work very well, although we still need to work on getting edges sharper, which is the primary limitation of texture-projected stereo.
[older email]
- If we could coordinate on this project so we agree on what we’re trying to produce, that would be great.
- Large-scale stereo is very difficult, requiring advances in a number of fronts, including calibration. It’s not just a question of computational power.
- We should talk with the NASA/Ames folks, who have some experience with this sort of thing in doing their DEM work. I did briefly when I was there; unfortunately their situation is much better than ours, in terms of having great texture, and being almost front-planar and within a restricted disparity range.
- There are other groups doing large-format stereo, I think one at the U of Delaware.
Release:
- Tickets
- All of the patch tickets have been closed, along with the out-of-date tickets
- OpenCV forum, there are bug reports.
- Coverage — no new tests
- C++ docs
- Found in PDF, so problem is not intrinsic
- User contributed code
- Doxygen docs
- Another contributor is working on doxygen comments for OpenCV in a separate branch
- Each contribution
- Top level directory called cv_contrib
- Subdirectory for each contribution
- include and src
- CMakeLists.txt
- include code will have root CMakeLists.txt
- black list
- separate svn for cv_contrib
- Carbon bindings for HighGui in Mac need to be put back (build problems)
- Other than that, the release is in pretty good shape.
Descriptors:
- Patrick’s post, works for descriptors that are vectors, but not to multi-vector
- Also non-standard metric matchers (intersection)
- Victor’s approach does not give an explicit representation of a descriptor
- More generic, just
- Compute descriptors
- Classify points
- Has subset of vector descriptors
- Add SIFT as User contrib Alex Boverin
- New proposed interface is: http://www.ros.org/doc/api/outlet_detection/html/classcv_1_1DescriptorMatchGeneric.html
Vadim
The work on the release continues.
- Several tickets have been closed (## 92, 121, 130, 131, 136, 147, 160, 162, 167, 190, 196). Work on a few more tickets is in progress [Vadim, Maria]
- Draft version of the installation package for Windows has been built and sent to James. Besides a few minor glitches, it seems to install and work fine [Vadim]
- Calonder descriptor code has been cleaned up, the sample find_obj_calonder.cpp has been added [Alexey]
- Because of quite a few problems with building and using OpenCV related to OpenMP (in particular, at least 4 open trac tickets relate to OpenMP), it has been decided to drop OpenMP and replace it with TBB (StereoBM was already ported to TBB some time ago by Anatoly).
- Last week Haar/LBP object detector, HOG detector and Distance transform have been converted to TBB.
- In CMake scripts TBB detection part has been improved as well.
- It’s planned to convert the remaining OpenMP code in vision algorithms (SURF, sparse optical flow) to TBB before the release [Vadim]
Plans for this week:
- close some more tickets,
- fix and extend documentation,
- test installation and source packages on various platforms,
- finish preparations to the release.
Victor
- Descriptor interface:
- Implemented support for SURF in the new descriptor interface.
- Now switching from one way to SURF for outlet detection is a matter of changing 3 lines of code.
- The interface has been restructured and documented, doxygen-generated doc is here
- http://www.ros.org/doc/api/outlet_detection/html/classcv_1_1DescriptorMatchGeneric.html
- One way:
- the ROS version has been synchronized with opencv version (latter was outdated, former contains several bug fixes as well as support for kd-tree search).
- Other:
- NNSU university has started working on the implementation of Gradient Boosting Trees (GBT) and Felzenszwalb’s latent SVM.
- They will be designing both within opencv framework and the intention is to submit the code to opencv once it is in good shape (at least the single-threaded reference implementation).
- They already have a stripped implementation of GBT that now uses opencv decision trees (while making the move to opencv data structures they found a bug fixed by Maria).
GPU
- Several attempts were made to optimize Joe’s stereo implementation with CUDA. We measured that texture fetching (left and right image) is the most time-consuming part of the algorithm (running time drops by about 50% if we comment the fetching out), and reading the min SSD from global memory is the second (a 30% drop in running time after commenting it out). Fetching already uses the texture cache, so we had no success optimizing it. We tried to use shared memory for caching the min SSD, but performance degraded: each block consumed much more shared memory and, as a result, occupancy decreased.
- We have found the source code of two other GPU stereo implementations. One of them is quite interesting (http://www.cs.unc.edu/~gallup/stereo-demo); we studied it in detail and adapted it to work with larger resolutions. The implementation minimizes global memory accesses by using shared memory to store image tiles. But the 16kb of shared memory limits parallel execution of blocks, so this implementation is useful only for medium-resolution images (up to 900×750).
- We developed a GPU implementation based on the calculation of integral sums (of squared differences, actually). Although this implementation consumes a larger amount of memory (about 1GB), it is expected to load the memory bus more effectively (and as a result scale better on Quadro cards, which have 10x the memory throughput of the cards we use now). We have the first version working and are optimizing it now.
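As a rough illustration of the integral-sum idea in the last item, here is a plain CPU sketch in Python; it is not the CUDA code discussed in these notes. For each disparity it builds a summed-area table of per-pixel squared differences, so each window SSD costs four lookups whatever the window size. The function names, the list-of-lists image format and the border handling are assumptions made for the example.

```python
def integral_image(values):
    """Summed-area table with one extra zero row and zero column."""
    h, w = len(values), len(values[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += values[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def window_sum(sat, x0, y0, x1, y1):
    """Sum over the half-open rectangle [x0, x1) x [y0, y1)."""
    return sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]


def block_match(left, right, max_disparity, radius):
    """Plain SSD block matching; returns a per-pixel disparity map.

    Pixels too close to the border keep disparity 0 (an arbitrary choice
    for this sketch).
    """
    h, w = len(left), len(left[0])
    best_cost = [[None] * w for _ in range(h)]
    disparity = [[0] * w for _ in range(h)]
    for d in range(max_disparity + 1):
        # Squared differences for this disparity; columns x < d have no match.
        sq = [[(left[y][x] - right[y][x - d]) ** 2 if x >= d else 0
               for x in range(w)] for y in range(h)]
        sat = integral_image(sq)
        for y in range(radius, h - radius):
            # Start at radius + d so the whole window has valid matches.
            for x in range(radius + d, w - radius):
                cost = window_sum(sat, x - radius, y - radius,
                                  x + radius + 1, y + radius + 1)
                if best_cost[y][x] is None or cost < best_cost[y][x]:
                    best_cost[y][x] = cost
                    disparity[y][x] = d
    return disparity
```

The memory cost mentioned above comes from the summed-area tables: this sketch keeps one table at a time, whereas an implementation that keeps tables for many disparities at once trades memory for fewer passes.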
Action Items
Gary
- Define code contrib. process
- Post to user group about where to post bugs
- Send Vadim example user contrib
Vadim
- Create cv_contrib with example make files etc
Victor
- See about encouraging a user contrib of sift
James
- C++ functions in wiki doc
From last time
Gary
- Define code contrib. process
- Pros and cons of user documentation
- Ask James about converter
Vadim
- Define code test system for user contrib
- Define “other” directories for patented and contrib code
Victor
- Ask James about status for release
James
Agenda
Release
- User contrib oi vey
- Can release and patch
- Can go to Google News and set a filter for “OpenCV bug”
GSOC
- go through list… make list, rank list
Going forward
- Big effort on texture based recognition.
Next
- Release: March 31st
- Segmentation: Might get as intern(?) —
- Active Segmentation with Fixation. Ajay Mishra, Yiannis Aloimonos, Cheong Loong Fah, ICCV 2009
- Code (Matlab and Windows at least).
- Texture recognition toolbox
- Some similarities with MOPED from CMU, but
- not dependent on SIFT
- cloud capable
- accommodate Stereo up front
- contours
- flexible objects
- New texture features:
- Combine Daisy like idea, or:
- Scale Invariant Feature Transform with Irregular Orientation Histogram Binning, Yan Cui, Nils Hasler, Thorsten Thormählen, Hans-Peter Seidel
- With patch duplets described in: Patch-Duplets for Object Recognition and Pose Estimation by Bjorn Johansson and Anders Moe
- Tested in A Performance Evaluation of Local Descriptors by Krystian Mikolajczyk and Cordelia Schmid
- With binarized features as described in Multiple target localization at over 100 fps, BMVC 2009, S. Taylor and T. Drummond.
- Stereo
- A good overall reference to stereo algorithms is:
- http://www.vision.deis.unibo.it/smatt/seminar_stereo_vision.html
- Bilateral filter followed by a local smoothing method:
- http://www.vision.deis.unibo.it/smatt/lc_stereo.htm
Minutes
- Bad problem with C++ documentation. It is a real functionality barrier not to be able to find all functions
- Names have changed: cvSmooth is now GaussianBlur, etc.
- twhurl, splitweet
- Probably should update ros cv_bridge to use cv::Mat instead
- http://www.ros.org/wiki/cv_bridge/Tutorials/UsingCvBridgeToConvertBetweenROSImagesAndOpenCVImages
- IplImage* imgMsgToCv(sensor_msgs::Image::ConstPtr image_message, string cv_encoding="passthrough")
- Test system
- We have our own
- Switch to using Google Test? http://code.google.com/p/googletest/
- Future
- Texture based feature
- Work on dataset so we can create the “solved problems in vision” challenge.
- Anybody
Victor
-
Outlet detection:
- tested the new geometrical model on all available datasets with 2×1 white outlets. The results show that we can detect outlets without one way descriptor, by matching edges (chamfer distance) and regions together with keypoints.
- Average detection time is 30ms.
- The new method is better than one way on low quality images (low contrast, low resolution), and worse on high quality images.
- Both methods have very low false alarm rate (less than 1%).
- The new method can be generalized to other textureless planar objects such as European outlets.
-
Descriptor interface:
- Implemented support for kd-tree nearest neighbor search for all descriptors that can be represented by a vector.
- Interface has been improved.
- A sample showing visual odometry task (matching between two sets of keypoints) has been implemented.
-
Other:
- planning for GSoC project.
CUDA
- Accomplishments:
- Integral Images.
We have implemented several versions of block-matching based on integral images (see draft foils). Performance of the current version is 1.2sec per 2.7MP frame (1800×1500) on NVidia GT220 (versus 1.38sec by our tuned implementation from OpenVIDIA). We have more ideas on how to optimize this implementation and have started to work on them.
- Other activities:
- GPU memory access experiments (texture fetching, global memory access for different data types).
- Reading Hirschmueller GPU paper (Ines Ernst and Heiko Hirschmuller (2008), Mutual Information based Semi-Global Stereo Matching on the GPU, in Proceedings of the International Symposium on Visual Computing (ISVC08), 1-3 December 2008, Las Vegas, Nevada, USA).
- Joe’s suggestion for parallel execution with several disparities discussed. Some preparations for implementation were made.
- Plans:
- Optimize Integral Images implementation.
- Start to implement Joe’s suggestion to process several disparities simultaneously (for better texture cache utilization).
Vadim
The work on the release is almost finished (although we would not mind spending some more time on documentation and bug reports).
- Several tickets have been closed (## 86, 93, 122, 123, 128, 182, 194, 201, 211, 214, 218, 219, 221, 222, 226). Work on a few more tickets is in progress [Vadim, Maria, Alexey]
- All of the OpenCV threaded code, except for the old-style haartraining application and part of the new traincascade application, has been converted from OpenMP to TBB, in particular sparse optical flow and SURF [Vadim] and the decision tree engine in MLL [Maria]. The new TBB code demonstrates pretty good threading efficiency.
- Documentation has been written on several cxcore classes [Vadim], on the cascade detector [Maria] and on the Calonder descriptor [Alexey]
- Scanned the mailing list at Yahoogroups for bug reports since January 2010; one severe bug in FindHomography has been found and fixed [Alexey, Vadim]
- The OpenCV build and test framework (buildbot-based) has been improved (added some hooks for Python tests on Windows) and optimized to spend 1 hour less on the complete build & test procedure. Several new test failures have been reproduced and put to the tracker (tickets #29 and 113) [Alexander]
Plans for the next 2 days: close some more tickets, finish the most important parts of the documentation, test the binary and source packages on various platforms, and finally do the release.
Action Items
Gary
- C++ documentation
- Get Vadim a contrib function asap with mocked out test code
- Send Victor email to see him in beginning of May in US
- Send Vadim email about CVPR
Vadim
- Set up user contrib
- Renew visa to cover CVPR
Victor
- Renew visa to cover CVPR
James
- C++ documentation
From last time
Gary
- Define code contrib. process
- (./) Post to user group about where to post bugs
- Send Vadim example user contrib
Vadim
- Create cv_contrib with example make files etc
Victor
- See about encouraging a user contrib of sift
James
- (./) C++ functions in wiki doc
Agenda
Release
- Release report
- User contrib oi vey oi vey oi vey
cv_patented
-
SURF
- http://www.faqs.org/patents/app/20090238460
TBB
- GPL with runtime exception
GSOC
- OpenCV on Android/meeting with streetview team
- Refine projects
- Go through lists
Going forward
- Big effort on texture and feature based recognition.
- We need to unify the interface between what’s used in ROS and in OpenCV
- Other:
- Face detection and face pose for Texai
- HD stereo.
- Image stabilization
- What needs doing in VO
Next
- Release: March 31st
- Segmentation: Might get as intern(?) —
- Active Segmentation with Fixation. Ajay Mishra, Yiannis Aloimonos, Cheong Loong Fah, ICCV 2009
- Code (Matlab and Windows at least).
- Texture recognition toolbox
- Some similarities with MOPED from CMU, but
- not dependent on SIFT
- cloud capable
- accommodate Stereo up front
- contours
- flexible objects
- New texture features:
- Combine Daisy like idea, or:
- Scale Invariant Feature Transform with Irregular Orientation Histogram Binning, Yan Cui, Nils Hasler, Thorsten Thormählen, Hans-Peter Seidel
- With patch duplets described in: Patch-Duplets for Object Recognition and Pose Estimation by Bjorn Johansson and Anders Moe
- Tested in A Performance Evaluation of Local Descriptors by Krystian Mikolajczyk and Cordelia Schmid
- With binarized features as described in Multiple target localization at over 100 fps, BMVC 2009, S. Taylor and T. Drummond.
- Stereo
- A good overall reference to stereo algorithms is:
- http://www.vision.deis.unibo.it/smatt/seminar_stereo_vision.html
- Bilateral filter followed by a local smoothing method:
- http://www.vision.deis.unibo.it/smatt/lc_stereo.htm
Minutes
- Release 2.1 is out
- Need to update http://opencv.willowgarage.com/wiki/OpenCV%20Monthly
- Not fully announced yet
- http://opencv.willowgarage.com/wiki/Welcome/Introduction
- http://opencv.willowgarage.com/wiki/OpenCV%20Change%20Logs
- Create contributed directory
Detector Descriptor Interface
- [Vic] Likes the approach of explicit representation of data
- See doxygen here http://www.ros.org/doc/api/outlet_detection/html/classcv_1_1DescriptorMatchVector.html
- How would Nistér trees fit into this?
- At each level of the tree, you need to compute distances
- Is there a notion of distance or score in Victor’s scheme
- Thinking of confidence function, ratio of first to second as in Lowe’s use
- Wasn’t thinking of distances since don’t appl
- Can introduce notion of distance to any function — a function that returns a vector of distances, not just class decisions (existing integer vector implementation in vic’s code)
- Separate object for each node would make Nistér tree cumbersome
- Vocab tree by Patrick is templated
- Nistér stay with explicit representation
- Changing matchers
- Classify vectors function can be used to choose brute force or index approach
- Function to retrieve a matrix of descriptors.
- What about query descriptors? Trained descriptors are stored, how to get run time match descriptors?
- 2 interfaces? features_2d
- Single object that computes features with explicit representation
- If a feature maps into a fixed dimensional space
- Use normal metric
- Then features_2d
- But if it’s something like array of decision trees
- Then go with generic interface
- 2 interfaces descriptor computation + matching glued together
- vs. Descriptor computation and generic matcher separate
- To compare everything, use wrapper for feature_2d to combine desc. compute and match so you can use general one for both
- In descriptor
Vadim
- OpenCV 2.1 has finally been released!
Here is the announcement in the development mailing list: http://bit.ly/afgckK
I decided not to rush with a broad announcement, to let the most active people test the package first. It now seems it was a good decision to postpone the release by a few days, as several bugs have been fixed and the bug tracker has been cleaned up a bit more.
- We closed 24 trac tickets (## 32, 63, 82, 107, 117, 127, 138, 153, 161, 168, 195, 198, 207, 208, 231, 233, 236, 237, 240, 241, 244, 245, 246, 252) and more than 12 tickets at SF bug tracker [Vadim, Alexander, Maria].
Victor
Object recognition:
- refactoring of geometric matching code.
- Generation of hypotheses has been separated from matching so that different transformation classes (affine, perspective, etc) can be used together with different hypotheses generation algorithms (exhaustive search, hashing, RANSAC etc) for matching different features (points, edges, regions).
- Outlet detection is using the new interface, object detection sample is on the way.
- Implemented support of the new keypoint detection interface (from features_2d) for MSER and Harris corner detector. The code will be committed to OpenCV after the release. [Maria]
GPU
Accomplishments:
- NVidia GPU stereo implementation has been optimized to 0.79sec/2.7Mp (original runtime was 1.9sec). The disparity loop was unrolled (for better texture cache utilization) and type conversion was eliminated. The alternative Integral Images GPU stereo implementation has been optimized to 1.08sec/2.7Mp by unrolling loops and avoiding bank conflicts. Both implementations use plain block-matching, so later we may incorporate pyramids and some runtime-saving tricks (e.g. disparity step = 2).
- We started a new stereo implementation based on Joe’s suggestion to keep image tiles in shared memory (SMEM). But we keep SSD columns in SMEM for different disparities (it takes SSDWindowSize * MaxDisparity * sizeof(int) of SMEM). The initial version is quite memory and time consuming (about 6sec/2.7Mp), and we need to work on it. It is probably better to avoid column SSD computation and instead use SMEM for storing image tiles.
- Remap and undistort for a stereo pair have been implemented. Results: 60fps/2Mp (1920×1080) for both left and right images on GT220.
All the experiments were performed on NVidia GT220 with MaxDisparity = 224 and SSDWindowSize = 19 on an 1800×1500 (2.7Mp) image.
Plans:
- Experiment with SMEM approach.
- Probably start to experiment with a pyramidal approach and some other tricks.
- Create dataset with actual camera resolution.
- Start to work on Hirschmueller SGM algorithm.
Action Items
Gary
- Ask Kurt what the needs are in VO.
Vadim
Victor
- Iterate on interface with Patrick
Patrick
- See Victor above
James
From last time
Gary
- :( C++ documentation
- X-( Get Vadim a contrib function asap with mocked out test code
- (./) Send Victor email to see him in beginning of May in US
- (./) Send Vadim email about CVPR
Vadim
- :( Set up user contrib
- /!\ Renew visa to cover CVPR
Victor
- /!\ Renew visa to cover CVPR
James
- :D C++ documentation
Agenda
-
GSOC
- Today: slot allocation
- April 18th is next deadline — students selected
- See to do list
- See the timeline
- CVPR
- Solved problems in vision
- Grouply site offer
Minutes
- Translate bgfg_segm.cpp into C++
- Create cpp directory inside of samples
- Vadim created connected_components.cpp and morphology2.cpp
Next
- Python release
- Code contribution process into its own directory … high priority
- CVPR
- VO (Victor)
- Detector descriptor pipeline
- Improve camera calibration … yet more for more stable distortion parameters
- Use edges and other geometric figures
- It already allows for arbitrary 2D and 3D patterns, needs tools for finding different patterns
- Allow “infinite” grid (chessboard that goes off camera)
- Improving distortion
- Show “infinite” board
- Detect all fiducial points/corners (could fit curve) Try to optimize all the points
- Most information is at the edge
- Most of the improvement will be in finding better, subpixel edges
- Can use results from LePetit — very fast PnP solves 1
- Barcodes
- Approximate NN
- k-means in FLANN, is opencv’s k-means faster — question for Marius
- Same kd tree
1 F. Moreno-Noguer, V. Lepetit and P. Fua, Accurate Non-Iterative O(n) Solution to the PnP Problem, IEEE International Conference on Computer Vision, Rio de Janeiro, Brazil, October 2007 (with code in Matlab and C++)
Grouply
- Problems:
- They decide to start charging us later
- They go out of business
Solved Problems in Vision
- Will launch (probably) at ECCV workshop
Vadim
- Some minor maintenance work after the 2.1.0 release has been done. 19 tickets have been closed (## 52, 108, 135, 189, 209, 224, 239, 251, 258, 261, 269, 270, 272, 273, 274, 276, 277, 278, 280).
- While the number may seem huge, given that 2.1 has just been released, in fact most of the bugs have either already been fixed in 2.1.0, or are duplicate bug reports, or are quite minor problems. The only two important bugs:
- cvHaarDetectObjects crashed when no face grouping is made (parameter minNeighbors=0), and the OpenCV build failed on MacOSX 10.6 when using command line makefiles (but Xcode built it fine). [Vadim]
- Created opencv/samples/cpp directory where we put samples using the new C++ interface. There are 2 samples for now: morphology2.cpp (rewritten morphology.c sample) and connected_components.cpp. [Vadim]
Victor
-
Object recognition:
- Experiments with descriptor interface are in progress. The two interfaces (explicit representation of descriptors and a more generic API) can be merged together.
- The main issue now is inconvenience of using one way descriptor for VO — it lacks flexibility in adding/removing descriptors.
- A bug in one way descriptor has been corrected. [Victor]
-
Keypoints:
- Keypoint detection has been ported from features_2d to OpenCV and enhanced with MSER and Harris corner detector.
- A sample of using keypoint detection and descriptor matching has been implemented.
- A testbench for comparing the performance of keypoint detectors and descriptors is in progress. [Maria]
-
Image stabilization
- Several methods for image stabilization have been investigated. The paper 1 is the candidate for implementation. [Alexey]
1 Ken-Yi Lee et al, Video Stabilization using Robust Feature Trajectories, ICCV’09.
GPU
- Both versions of the GPU stereo implementation (NVidia and Integral Images) have been tested on Quadro FX 5800. NVidia’s version demonstrates # 0fps, Integral Images version – # 7fps. These numbers are for the highest resolution of the Silicon Imaging camera (2048×1152), MaxDisparity=256 and SSDWindowSize=19. The speedup ratio on Quadro is about 5x, proportional to the increase in the number of multiprocessors (30 on Quadro vs 6 on GT220). We have made some new minor performance optimizations, but they gave us only a 5% improvement in runtime.
- A pyramidal version of NVidia GPU stereo block matching has been implemented. It uses only one additional pyramid level (1/2 in width and height) because we try to avoid errors on coarse levels. The first version demonstrates 8fps (on Quadro) without any quality degradation. If we use a smaller SSD window (11 instead of 19) during the second pass (disparity refinement) we can achieve 20fps without visible disparity map degradation. But we plan to find ways to avoid errors caused by the pyramidal approach.
- We also tried to implement Joe’s suggestion to keep image tiles in SMEM. But for our MaxDisparity and SSDWindowSize values this approach requires too much SMEM and was therefore unsuccessful: MaxDisparity x SSDWindowSize = 256 × 19 = 4864 bytes per block just for the left image tile and a single pixel per block. Also, the procedure for choosing the minimum disparity between different threads takes too much time (about 0.9sec on GT220 even after avoiding bank conflicts). So this approach does not look promising for us.
Plans:
- Experiments with pyramidal approach.
- Experiments on Fermi card (if we get it this or next week).
Action Items
Gary
- Write Vincent about epnp license
- Write sample cv_contrib
Vadim
- Set up cv_contrib
Victor
Patrick
James
From last time
Gary
- (./) Ask Kurt what the needs are in VO.
Vadim
Victor
- Iterate on interface with Patrick
Patrick
- See Victor above
James
Agenda
- GSOC — 8 slots
- Texture rec toolbox
- VO
- Reorg of directory structure
- Adding user contrib there
Going forward
- CVPR Demo
- Solved problems in vision
- Improved camera calibration
- Better descriptors
- Implement 1
- 2D bar codes
- 1D bar codes
- Infinite grid
1 F. Moreno-Noguer, V. Lepetit and P. Fua, Accurate Non-Iterative O(n) Solution to the PnP Problem, IEEE International Conference on Computer Vision, Rio de Janeiro, Brazil, October 2007 (with code in Matlab and C++)
Minutes
-
GSOC
- Looks good for 8 slots, but final decisions on April 26th
- Had 50 applications, figure out how to nicely tell the others they weren’t selected. Many are very good, maybe as interns next year etc.
- Pays to look over the 50 slots
-
Reorg of OpenCV directory structure
- Can be done in 2 weeks
- We’re waiting for this to move descriptors in ROS to OpenCV
- Other problems might happen due to the change
- Maria working with descriptors inside ROS
- Build bot
- Can we go in steps?
- Move several folders to different locations
- Vadim prefers to do it all in one step to get user contrib started
- Step 1: Enable the module structure (fix folder structure)
- Step 2: Divide up cvaux while Maria works on moving over the toolbox
- Vadim wants to do in branch first, then in trunk
- Discuss this offline, since finishing descriptors is high priority but we don’t want to create havoc with the change.
-
VO
- 2 images and point cloud from 2 frames
- Matches 2 frames using STAR and Calonder descriptor (ferns) using X,Y,Z
- Then does bundle adjustment over all the frames
- Output is reconstruction of camera trajectory and location of 3D points
- Walls found well
- Victor is trying to enable the same thing for monocular cameras
- Estimate translation of camera for point to point (SFM)
- Assume some points lie in plane.
- Easier Faugeras paper, 4 pt method estimates Essential matrix and R,T.
- Closed form to 1 of 8 possible configurations, use visibility constraints to eliminate back of camera solutions
- Then use epipolar projection error on other points to resolve final view
- Ransac
- 4 random points
- Compute R,T with 4pt algorithm
- Compute epipolar reprojection error on all points
- Choose best
- This initializes monocular correspondence between 2 frames
- Then use sparse bundle adjustment to improve 3D point location and R,T
- Compute 3D point cloud error using extrinsics and intrinsics
- cvTriangulatePoints — This doesn’t show up in the docs
- Created ticket for this #302 Gary
- Somewhat related to Gary’s stitching project in GSOC
- Use all this to initialize a “STF pipeline” or “STF Stack” in OpenCV
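The RANSAC steps listed above (sample, solve, score on all points, keep the best) have the shape sketched below. This is only a sketch: a toy 2D translation model stands in for the 4-point R,T solver, which is not reproduced here, and the score is the average error over all matches, following the "minimum average error" selection described in these notes. `ransac`, `fit_translation` and `translation_error` are hypothetical names, not functions from the posest package.

```python
import math
import random


def ransac(matches, sample_size, fit, error, iterations=100, seed=0):
    """Keep the model with the lowest average error over all matches."""
    rng = random.Random(seed)
    best_model, best_avg = None, math.inf
    for _ in range(iterations):
        sample = rng.sample(matches, sample_size)  # "4 random points" in the notes
        model = fit(sample)                        # stand-in for the 4pt algorithm
        avg = sum(error(model, m) for m in matches) / len(matches)
        if avg < best_avg:
            best_model, best_avg = model, avg
    return best_model, best_avg


# Toy model for illustration only: each match is (p, q) with q = p + t.
def fit_translation(sample):
    n = len(sample)
    tx = sum(q[0] - p[0] for p, q in sample) / n
    ty = sum(q[1] - p[1] for p, q in sample) / n
    return (tx, ty)


def translation_error(model, match):
    (px, py), (qx, qy) = match
    return math.hypot(px + model[0] - qx, py + model[1] - qy)
```

For the monocular case described here, `fit` would be replaced by the planar solver and `error` by the epipolar error; the loop itself stays the same.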
-
Textured object toolbox
- [Alexander C.] has a dataset from Romain T. at Willow with different objects on a turntable
- Use this to do a demo at ICRA and/or at CVPR.
- Cropped images from the bag files
- Match test image to one of the objects using bag of features
- Then move to object pose
- Use stereo data for 3D using SolvePnP()
- We’ve suffered from too little texture here [Anatoly/Suat]
- 3D capture is different, here we’re going for pose
- Recognize proposed object list
- Given the point cloud in a reference image
- Reproject points to get confirm or reject
- Test bench for detectors is already in place
- Tests stability of the detectors
- test set with known homographies
- see how well the detectors match between planes
- distort images and see how well the points stay in place
- all opencv detectors work with this test bench
-
CVPR Demo
- Visa problems for CVPR? Vadim hasn’t applied yet … time is getting short.
- Do you have to go to US Embassy or not — Vadim not expired, so maybe OK
- Run ICRA demo
- Stephan doesn’t want to show his work until ECCV
- Try to show our version of “MOPED” there to get recognition and pose in clutter.
-
Solved problems in vision challenge
- Might use our version of MOPED
-
Improved Camera Calibration
- Markers in OpenCV that are very robust to detect
- Vijay was using the chessboard in the robot arm. Gets failures
- We’re collecting the failures which tend to occur in
- low lighting
- bright backgrounds
- We need to correct this but:
- We want a blob based marker that can be detected robustly anywhere
- Quality of sub-pixel refinement … chessboards aren’t great for this … up to 0.5 pixel error
- Better sub-pixel makes a big difference
- Design for few false alarms and then for grid. Geometric matching then is robust.
-
Barcodes
- 2D and 1D barcode reading …
- Post CVPR
- Problem with eigen
Vector4d a;
Vector3d b = a.start(3);
Mat c;
eigen2cv(b, c);
you will get garbage in c. Vadim has reproduced this and is looking into it now.
- Fixed now:
- these are updated converters (also put to SVN). It should now work fine. The problem was that by default Eigen uses column-major data layout, while I expected it to be row-major.
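The general failure mode behind this kind of bug can be shown without either library. The sketch below (plain Python, with hypothetical helper names) flattens a matrix in column-major order and then reads the buffer back under both conventions; reading it as row-major scrambles the entries.

```python
def to_flat(matrix, order):
    """Flatten a nested-list matrix in row-major ('C') or column-major ('F') order."""
    rows, cols = len(matrix), len(matrix[0])
    if order == "C":
        return [matrix[r][c] for r in range(rows) for c in range(cols)]
    if order == "F":
        return [matrix[r][c] for c in range(cols) for r in range(rows)]
    raise ValueError("order must be 'C' or 'F'")


def from_flat(buf, rows, cols, order):
    """Rebuild a rows x cols matrix from a flat buffer stored in the given order."""
    if order == "C":
        return [[buf[r * cols + c] for c in range(cols)] for r in range(rows)]
    if order == "F":
        return [[buf[c * rows + r] for c in range(cols)] for r in range(rows)]
    raise ValueError("order must be 'C' or 'F'")


m = [[1, 2, 3],
     [4, 5, 6]]
buf = to_flat(m, "F")                   # column-major storage: [1, 4, 2, 5, 3, 6]
wrong = from_flat(buf, 2, 3, "C")       # read back as row-major: [[1, 4, 2], [5, 3, 6]]
right = from_flat(buf, 2, 3, "F")       # read back with the matching order: equals m
```

A converter therefore has to either copy element by element with the right index formula or transpose after wrapping the buffer.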
New proposed directory structure
opencv/
. 3rdparty/ # various 3rd-party libs, like libjpeg, clapack etc
. data/ # some pre-trained classifiers
doc/ # tex-based documentation for the key modules; will be retained
include/opencv # header files for backward compatibility
samples/{c,cpp,python} # samples in different languages
modules/ # source code for all the modules
. core/ # former cxcore, cut-down [quite] a bit – drawing and some other functionality will probably be moved other modules
. include/opencv2/core/.h # header files without cx or any other prefix
src/ # sources and internal headers
test/ # core tests
vision/ or imgproc/ # former cv or a part of it
. include/opencv2/vision/.h # header files without cv or any other prefix src/ test/
ml/
. include/opencv2/ml/.h src/ test/
highgui/
. include/opencv2/highgui/.h src/ test/
features2d/ # 2d keypoint detectors and descriptors from cv & cvaux
. include/opencv2/features2d/, src/, test/
background_segm/ # background segmentation from cvaux
. .. # also include/opencv2/background_segm/, src/, test/
vidsurv/ or blob_tracking/ # vs part of cvaux
. ..
visual_odometry/ or 3d_reconstruct/ # another part of cvaux
. ..
python/ # former interfaces/python
. ..
ffmpeg/ # former interfaces/ffmpeg
. ..
traincascade/ or traincascade_app/ or app_traincascade # former apps/traincascade
. ..
old/ or obsolete/ or depreciated/ # some obsolete functionality
. ..
utils/ # some helper utilities
correspondingly, each module binary will be named as opencv_, e.g.
- libcxcore.so* → libopencv_core.so*
- cv220.dll → opencv_vision_220.dll
- libcvaux.dyld* → libopencv_features2d.dyld*
- libopencv_background_segm.dyld*
- …
then, in order to add a new module, you will need to put the module directory to opencv/modules and add the module name to some global list in opencv/modules/CMakeLists.txt. That’s it.
Vadim
- The minor maintenance work after the 2.1.0 release continues. 6 tickets have been closed (## 271, 286, 287, 289, 291, 292), and two other bugs reported by mail (a bug in DFT and a bug in drawContours) have been fixed. [Vadim]
- added 2 new C++ samples – contours2.cpp (reimplementation of contours.c) and the incomplete version of object segmentation sample [Vadim]
- our OpenCV buildbot-based build system was fixed to build 64-bit OpenCV with mingw-w64 and successfully run Python tests on 64-bit Windows [Alexander]
- started investigation of the googletest and doxygen for the sake of better OpenCV and especially cvaux modularity, to allow users to contribute new code that would have been automatically documented and more easily tested.
Maria Dimashova
Texture descriptor test bench
- I implemented the test on detectors repeatability. Some information about this test is in
- opencv_extra/testdata/cv/detectors/detector_repeatability_test.txt.
- You may see the results of the test (i.e. the location and region repeatability of detectors) in
- opencv_extra/testdata/cv/detectors/algorithms/[detectorName]_res.xml files.
- The location repeatability is small in some cases, maybe because the detector parameters are poorly fitted. As regards the region repeatability, I implemented it for the case of scale invariant detectors only (the region is a circle). But there is a region repeatability criterion for affine invariant detectors (the region is an ellipse) in Mikolajczyk’s papers too. It seems that OpenCV detectors are scale invariant only. Affine normalization based on the second moment matrices is described in Scale & Affine Invariant Interest Point Detectors, Mikolajczyk. This procedure transforms the circular region of a key point into an elliptic region. Affine normalized detectors have higher region repeatability in case of significant image transformations, because the scale change is, in general, different in each direction.
- I think it would be useful to implement this affine normalization, not only for the detector repeatability test. We can get more stable descriptors for such affine normalized detectors. Am I right? What do you think about this?
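For readers unfamiliar with the measure, a simplified location repeatability can be sketched as follows. This is not the code of the test in opencv_extra: it assumes a known 3×3 homography H from image 1 to image 2, matches projected keypoints greedily one-to-one within a pixel tolerance, and, as a simplification, does not restrict the image-2 keypoints to the region visible in image 1. All names and the default tolerance are assumptions for the example.

```python
import math


def project(H, point):
    """Apply a 3x3 homography (nested lists) to a 2D point."""
    x, y = point
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return ((H[0][0] * x + H[0][1] * y + H[0][2]) / w,
            (H[1][0] * x + H[1][1] * y + H[1][2]) / w)


def location_repeatability(kp1, kp2, H, width, height, eps=1.5):
    """Fraction of keypoints re-detected within eps pixels after mapping by H."""
    projected = [project(H, p) for p in kp1]
    # Keep only image-1 keypoints that land inside image 2.
    visible = [q for q in projected
               if 0 <= q[0] < width and 0 <= q[1] < height]
    if not visible or not kp2:
        return 0.0
    unused = list(kp2)
    repeated = 0
    for q in visible:
        nearest = min(unused, key=lambda k: math.dist(q, k), default=None)
        if nearest is not None and math.dist(q, nearest) <= eps:
            unused.remove(nearest)  # one-to-one: each image-2 keypoint is used once
            repeated += 1
    return repeated / min(len(visible), len(kp2))
```

Region repeatability would replace the distance test with an overlap test between the mapped regions (circles for scale invariant detectors, ellipses after affine normalization).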
Action Items
Gary
- “Finish” off gradient techniques
- Collect clutter data for our version of MOPED
Vadim
- (./) Check the Eigen-cv::Mat image hang problem
Victor
Patrick
James
From last time
Gary
- :-( Write Vincent about epnp license
- :-( Write sample cv_contrib
Vadim
- ~ Set up cv_contrib
Victor
Patrick
James
Agenda
- GSoC: students set. Any issues?
- OpenCV directory structure re-org
- 2D feature set
- Test system set up with data
- Unclear what Suat is doing. Compatible?
- VO
- CVPR
- GPU
- 3D box capture latest opencv_extra, go to opencv_extra/3d/tracker3D
Going forward
- CVPR Demo
- Solved problems in vision
- Improved camera calibration
- Better descriptors
- Implement 1
- 2D bar codes
- 1D bar codes
- Infinite grid
1 F. Moreno-Noguer, V. Lepetit and P. Fua, Accurate Non-Iterative O(n) Solution to the PnP Problem, IEEE International Conference on Computer Vision, Rio de Janeiro, Brazil, October 2007 (with code in Matlab and C++)
Minutes
-
GSOC
- Features with Nicholas, Victor sync
-
2D Features
- Maria finishing up test bench for the 2D
- Then on to affine adjustment of features
- Features stuff might be a good CVPR demo
- Detectors are in OpenCV, all of them
- Can use test bench
- Descriptors are now in cvaux … don’t know if these are checked in yet
- Allow comparison against SIFT … how to link to it?
- Allow configuration to include “external” code
- Alexander working with mono matching. Set of descriptors matched together
- Move on to case where have 3D view to mono recognition and pose
-
Tracking a 3D box relative to a chessboard
- … https://code.ros.org/svn/opencv/trunk/ – contains both opencv and opencv_extra
-
Re-org directory structure
- Modules build
- Before check in, test and sample build too
- Finish this week
- Then check that the ROS packages still build
-
VO
- Status not changed.
- Planar SFM
- in posest package
- wrapped pose estimator 2d package as a plug in. Should work then in VSLAM package
- If so, can reconstruct point clouds etc
- Hope to finish by next week
- Suat doing this with stereo pose
Victor
Visual odometry:
- Implemented a structure from motion algorithm that consists of RANSAC on top of planar structure from motion (Faugeras & Lustman); the best configuration is selected by the minimum average epipolar projection error. The algorithm works on scenes that have significant planar parts. A test on simulated data showed that the algorithm is capable of finding a correct solution with 20% of points lying in the same plane, with 1 pixel noise in image projections and 5% wrong point matches. Subsequently sparse bundle adjustment is applied to refine both the point cloud and the extrinsic parameters. The code is in the posest package; I am in the process of wrapping it into a pe::PoseEstimator2d object that will be integrated into vslam. [Victor]
Object recognition toolbox:
- A testbench for detector stability has been implemented according to the approach of Mikolajczyk et al., “A comparison of affine region detectors”, IJCV’06. The testbench evaluates location and region repeatability for all available detectors (fast, gftt, harris, mser, star, surf). The testbench works with the features_2d::FeatureDetector interface, so any detector implementing this interface can be added for testing. Descriptor interfaces (DescriptorExtractor, DescriptorMatcher, DescriptorMatchGeneric and their descendants) have been ported to OpenCV cvaux. They will be moved to a special package after Vadim finishes the OpenCV structure reorg. [Maria]
- An experiment on recognizing textured objects from the turntable dataset by matching descriptors between training and test images has been performed. The training dataset consists of several cropped images of each object. For each descriptor in a test image we find the closest descriptors in the training dataset and assign the test image the label of the training object that has the most matches. Matches with confidence (defined as [distance to the second closest neighbor]/[distance to the closest neighbor] - 1) lower than a fixed threshold are filtered out. This method gives 100% accuracy on all textured objects (descriptors from the background are filtered out by confidence), but the choice of confidence threshold might depend on the data, and an experiment on a larger dataset is needed. [Alexander]
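The confidence filter and label vote described in the last item can be sketched roughly as follows. This is only an illustration: `Match`, `confidence` and `voteLabel` are hypothetical names (not the toolbox API), the nearest and second-nearest distances are assumed to be computed elsewhere, and ties are broken arbitrarily in favour of the lowest label.

```cpp
#include <cassert>
#include <limits>
#include <map>
#include <vector>

// Hypothetical record for one test descriptor: the training object that owns
// its nearest neighbour, plus the distances to the two closest neighbours.
struct Match {
    int objectLabel;
    double nearestDist;
    double secondDist;
};

// Confidence as defined in the notes: second-closest / closest distance, minus 1.
// A zero nearest distance is treated as fully confident unless the second
// distance is zero as well (an assumption made for this sketch).
double confidence(const Match& m) {
    assert(m.nearestDist >= 0.0 && m.secondDist >= m.nearestDist);
    if (m.nearestDist == 0.0)
        return m.secondDist > 0.0 ? std::numeric_limits<double>::infinity() : 0.0;
    return m.secondDist / m.nearestDist - 1.0;
}

// Drop matches below the confidence threshold, then return the label with the
// most surviving matches (-1 if nothing survives).
int voteLabel(const std::vector<Match>& matches, double minConfidence) {
    std::map<int, int> votes;
    for (const Match& m : matches)
        if (confidence(m) >= minConfidence) ++votes[m.objectLabel];
    int best = -1;
    int bestVotes = 0;
    for (const auto& kv : votes)
        if (kv.second > bestVotes) { best = kv.first; bestVotes = kv.second; }
    return best;
}
```

With a threshold of 0 every match votes, so background descriptors can outvote the object; raising the threshold removes the ambiguous ones, which is the data-dependent choice the notes warn about.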
- GPU
- Work to increase the frame rate from 11 to 14 fps on Fermi at 2048×1152, disparity 256, window size 19
- Seems this will be good enough for CVPR demo
- 2 Fermi cards might get this to frame rate
Vadim
The big task performed last week and continuing this week is the reorganization of the OpenCV directory structure. The goal is to make OpenCV more modular and friendlier to future extensions and user contributions. The current state (not committed to SVN yet) is that the following modules build flawlessly:
core (former cxcore), imgproc (most of the former cv functionality), highgui, ml, legacy (most of the former cvaux functionality), as well as the new specialized modules: tracking (optical flow, motion templates, video surveillance part of cvaux), features2d (the new framework that is being implemented + SURF + StarDetector + FAST + MSER), objdetect (haar + hog + ferns), background_segm (3 algorithms from cvaux), calib3d (epipolar geometry + camera calibration + pose estimation), flann (bindings for FLANN)
Plans:
- finish the reorganization, test everything, submit to SVN
- add Doxygen support to OpenCV build system (automatic documentation generation from OpenCV headers)
- add Googletest (or CxxTest or …?) support, write 1-2 sample tests.
GPU/Kirill Kornyakov
- Both versions of the stereo BM algorithm implementations were optimized for Fermi.
- NVidia’s implementation shows 10.8fps and Integral Images implementation shows 11.8fps.
- These numbers are for our Fermi card and the same 2048×1152 image, MaxDisparity = 256, SSDWindowSize = 19.
- We still maintain two implementations because they are based on different ideas, but they influence each other and this competition moves us forward.
During this work we’ve tested several optimization approaches. Among them is the SMEM approach, which was also not successful (we discussed it earlier). Right now we have no significant ideas for further optimization and plan to switch to algorithmic improvements.
Plans:
- Algorithmic improvements of the stereo implementation.
Maria Dimashova
- I implemented the test on detectors repeatability. Some information about this test is in opencv_extra/testdata/cv/detectors/detector_repeatability_test.txt.
- You may see the results of the test (i.e. the location and region repeatability of detectors) in
- opencv_extra/testdata/cv/detectors/algorithms/[detectorName]_res.xml files.
- The location repeatability is small in some cases, maybe because of inadequate fitting of detector parameters.
- As regards the region repeatability, I implemented it for the case of scale invariant detectors only (the region is a circle). But there is a region repeatability criterion for affine invariant detectors (the region is an ellipse) in Mikolajczyk’s papers too.
- It seems that OpenCV detectors are scale invariant only. Affine normalization based on the second moment matrices is described in “Scale & Affine Invariant Interest Point Detectors”, Mikolajczyk.
- This procedure transforms the circular region of a key point into an elliptic region. Affine normalized detectors have higher region repeatability in case of significant image transformations, because the scale change is, in general, different in each direction.
- I think it would be useful to implement this affine normalization, not only for the detectors repeatability test. We can get more stable descriptors for such affine normalized detectors. Am I right? What do you think about this? …[Gary thinks “yes”]
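For the circular-region case, a region repeatability check can be sketched as an overlap error between two circles. This assumes the criterion is the overlap error 1 - intersection/union from the Mikolajczyk et al. comparison, and that both circles are already expressed in the same image frame; the names `Circle`, `intersectionArea` and `overlapError` are illustrative and the actual testbench may differ.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// A keypoint region of a scale invariant detector: centre and radius.
struct Circle { double x, y, r; };

// Area shared by two circles (closed-form lens area).
double intersectionArea(const Circle& a, const Circle& b) {
    assert(a.r > 0.0 && b.r > 0.0);
    const double kPi = std::acos(-1.0);
    const double d = std::hypot(a.x - b.x, a.y - b.y);
    if (d >= a.r + b.r) return 0.0;                 // disjoint
    if (d <= std::fabs(a.r - b.r)) {                // one circle inside the other
        const double rm = std::min(a.r, b.r);
        return kPi * rm * rm;
    }
    const double alpha = std::acos((d * d + a.r * a.r - b.r * b.r) / (2.0 * d * a.r));
    const double beta = std::acos((d * d + b.r * b.r - a.r * a.r) / (2.0 * d * b.r));
    const double kite = 0.5 * std::sqrt((-d + a.r + b.r) * (d + a.r - b.r) *
                                        (d - a.r + b.r) * (d + a.r + b.r));
    return a.r * a.r * alpha + b.r * b.r * beta - kite;
}

// Overlap error in [0, 1]: 0 for identical regions, 1 for disjoint ones.
double overlapError(const Circle& a, const Circle& b) {
    const double kPi = std::acos(-1.0);
    const double inter = intersectionArea(a, b);
    const double uni = kPi * (a.r * a.r + b.r * b.r) - inter;
    return 1.0 - inter / uni;
}
```

A pair of regions would then count as repeated when the error falls below a chosen threshold; the elliptic (affine) case needs a different intersection computation.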
Action Items
Gary
- Talk with Kurt about integrating Suat’s work
- Get object images
- Get more data
Vadim
Victor
Patrick
James
From last time
Gary
- (./) “Finish” off gradient techniques
- :( Collect clutter data for our version of MOPED
Vadim
- (./) Check the Eigen-cv::Mat image hang problem
- in progress: Re-org directory structure
Victor
Patrick
James
Agenda
- Gary is at ICRA 2010 in Anchorage. Hope to still hold the meeting
- OpenCV and image meta-data. Some discussion with Sarnoff (soon to be absorbed into “SRI”)
- Maybe we can get them to open/contribute some of their video stitching
- Object Recognition
- Progress update on features_2d and PnP matching. Stereo to stereo, stereo to mono
- Sidd wants to contribute:
- MOPED: A Scalable and Low Latency Object Recognition and Pose Estimation System. Martinez, Manuel; Collet, Alvaro; Srinivasa, Siddhartha. ICRA 2010.
- Basically, we could replace their dependence on SIFT with simple use of any detector-descriptor
- OpenCV re-org into functionality stacks.
- Pose estimation should just work across functions (chessboard needs to also output the R,T matrices and a quaternion representation), so that it works more easily with TF from ROS.
- 3D box relative to calibration pattern — better integrated/higher priority in object recognition stack.
- OpenCV Cheatsheet. C++ and Python. Could be very useful.
Minutes
- Test code
- Cxx test engine
- Google unit test
- Possibility of Open source parts of videobrush from Sarnoff?
- Object recognition
- 100% accuracy after filtering out ambiguous matches (small gap between the best and 2nd best match distances) [Alexander]
- May not generalize. Need more features
- Add pose estimation to validate result
- Close to MOPED, we may get MOPED contributed (talking with Sidd at ICRA)
- Sync up with Suat
- See about getting enough feature points on the object for ransac to work
- Stereo
- Camera info from ROS
- Hard to convert to OpenCV intrinsic params
- Need translation 3D point to pixel
- Suat
- Suat has a wiki page on his work product:
- http://pr.willowgarage.com/wiki/Stereo_Object_Recognition
- Currently he’s getting stereo-to-stereo keypoint object recognition (MOPED) working, hopefully first results by Friday.
- Next is to integrate vocab trees from Patrick for fast pre-filtering over large object databases, and monocular-to-stereo keypoint recognition using Victor’s 2D/3D matching in package posest.
- Using the features_2d, frame_common, posest, sba, and vslam_system packages seems to be paying off in allowing us to coordinate efforts.
- Camera info between ROS and OpenCV
- Add flag
- With pose make it easy to use any type to any other type
- Rodrigues, R,T and to quaternion
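The conversions asked for in the last item (Rodrigues vector, rotation matrix, quaternion) follow standard formulas. The sketch below is standalone C++ with hypothetical helper names, not an existing OpenCV interface, and assumes the quaternion is stored as (w, x, y, z).

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;
using Quat = std::array<double, 4>;  // stored as (w, x, y, z); an assumed layout
using Mat3 = std::array<std::array<double, 3>, 3>;

// Rodrigues vector (rotation axis scaled by the angle in radians) to a unit quaternion.
Quat rodriguesToQuat(const Vec3& r) {
    const double theta = std::sqrt(r[0] * r[0] + r[1] * r[1] + r[2] * r[2]);
    if (theta < 1e-12)  // small-angle limit: sin(theta/2)/theta tends to 1/2
        return {1.0, 0.5 * r[0], 0.5 * r[1], 0.5 * r[2]};
    const double s = std::sin(0.5 * theta) / theta;
    return {std::cos(0.5 * theta), s * r[0], s * r[1], s * r[2]};
}

// Unit quaternion to a 3x3 rotation matrix.
Mat3 quatToMatrix(const Quat& q) {
    const double w = q[0], x = q[1], y = q[2], z = q[3];
    assert(std::fabs(w * w + x * x + y * y + z * z - 1.0) < 1e-9);
    Mat3 R;
    R[0][0] = 1.0 - 2.0 * (y * y + z * z);
    R[0][1] = 2.0 * (x * y - w * z);
    R[0][2] = 2.0 * (x * z + w * y);
    R[1][0] = 2.0 * (x * y + w * z);
    R[1][1] = 1.0 - 2.0 * (x * x + z * z);
    R[1][2] = 2.0 * (y * z - w * x);
    R[2][0] = 2.0 * (x * z - w * y);
    R[2][1] = 2.0 * (y * z + w * x);
    R[2][2] = 1.0 - 2.0 * (x * x + y * y);
    return R;
}
```

Going through the quaternion gives the rotation matrix for free, so a single pose type could expose all three forms.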
Vadim
The re-org of directory structure now looks like:
opencv/
  include/opencv/   # headers for backward compatibility
  modules/          # decomposed cxcore, cv, highgui, ml, cvaux:
    core/           # core functionality
    imgproc/        # image processing part of cv
    highgui/        # GUI
    ml/             # machine learning routines
    video/          # optical flow, motion templates, Kalman filter, blob tracking, background/foreground segmentation
    calib3d/        # camera calibration, epipolar geometry, stereo correspondence
    features2d/     # 2d feature detectors & descriptors
    objdetect/      # Haar/LBP & HOG object detectors, fern-based planar object detector
    legacy/         # obsolete functionality
    user_contrib/   # user contrib
    # some other stuff
    python          # new-style Python interface
    ffmpeg          # ffmpeg bindings
    traincascade    # Haar/LBP training application
  samples
  doc
  ...
In addition, we should create “OpenCV Central” like Matlab Central.
- cxcore.h, cv.h and other old-style headers include the new style headers, so, in theory, no changes at compile stage are required from user side. However, because of renamed modules, the appropriate changes need to be made at linking stage.
- In Linux this is handled by the new opencv.pc.
- The whole of OpenCV now builds fine. The new directory structure will be committed to SVN within a couple of days.
Plans:
- Commit changes to SVN
- Adjust makefiles for OpenCV-based modules in ROS
- Add support for Doxygen & googletest.
Victor
Object recognition toolbox:
- Experiments on recognition of textured objects from dataset showed 100% accuracy after filtering of ambiguous correspondences (by comparing the minimum distance to the second closest distance).
- A PnP pose estimation (from training stereo data and test image data) is in progress. [Alexander]
- DescriptorMatchGeneric wrappers of Calonder and fern descriptors have been implemented [Maria]
- SIFT (implementation of Andrea Vedaldi) has been added to OpenCV. DescriptorExtractor and DescriptorMatch wrappers have been implemented. [Maria]
Visual odometry:
- fixed several bugs during integration of PoseEstimator2d into vslam. The integration is still in progress. [Victor]
Other:
- May 3rd is a holiday in Russia.
- Victor was traveling to Moscow Apr 30 for US visa renewal.
- Alexey is leaving Itseez in May. We thank him for the work he did and wish him a successful career! We are interviewing several candidates now.
Action Items
Gary
- Talk to Scott about an “OpenCV Central” web site
- Contribute focus detector
- Substantial data set for textured objects and pose
- (./) Find out what Suat is doing, send coordinating email
- sync Victor with Sidd about MOPED
Vadim
- Write test system example to user_contrib
- Start OpenCV cheatsheet
Victor
- sync with Suat
Agenda
- Multiple calibration models for calibration and stereo rectification
- Resize RGB vs YUV … where the time is spent
- YUV<—>RGB conversion back and forth test
- Support for projected fringe pattern to get a depth pattern
- New calibration patterns
- CVPR
Depth from Phase or Gray Pattern
- The big idea is to use a projector to throw bands of light onto the scene.
- Some techniques do a relatively simple gray-code on binary projector images:
- http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.21.5393&rep=rep1&type=pdf
- Fancier ones use a sine-wave and phase-shift it each frame:
- http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.12# 6857&rep=rep1&type=pdf
- These guys talk more about calibration:
- http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.11# 5314&rep=rep1&type=pdf
- Structured light from phase patterns Structured_light_phase_2006_Zhang_High-resolution 3D Zhang Siggraph 2006
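As a rough illustration of the gray-code idea above: each projector column gets a binary-reflected Gray code, one stripe image is projected per bit, and a camera pixel recovers its column from the lit/unlit sequence it saw. The function names are hypothetical and thresholding of the camera images is assumed to be done elsewhere; the linked papers may use a different coding.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Binary-reflected Gray code: neighbouring columns differ in exactly one bit.
std::uint32_t toGray(std::uint32_t n) { return n ^ (n >> 1); }

// Inverse of toGray.
std::uint32_t fromGray(std::uint32_t g) {
    std::uint32_t n = 0;
    for (; g != 0; g >>= 1) n ^= g;
    return n;
}

// Stripe images for a projector `width` columns wide: patterns[k][c] is true
// when column c is lit in image k (image 0 carries the most significant bit).
std::vector<std::vector<bool>> makePatterns(int width, int numBits) {
    assert(numBits > 0 && numBits < 32);
    assert(width > 0 && static_cast<std::uint32_t>(width) <= (1u << numBits));
    std::vector<std::vector<bool>> patterns(numBits, std::vector<bool>(width));
    for (int k = 0; k < numBits; ++k) {
        const int shift = numBits - 1 - k;
        for (int c = 0; c < width; ++c)
            patterns[k][c] = ((toGray(static_cast<std::uint32_t>(c)) >> shift) & 1u) != 0;
    }
    return patterns;
}

// Recover the projector column from the lit/unlit sequence one camera pixel
// observed over the stripe images (most significant bit first).
std::uint32_t decodeColumn(const std::vector<bool>& observed) {
    std::uint32_t g = 0;
    for (bool bit : observed) g = (g << 1) | (bit ? 1u : 0u);
    return fromGray(g);
}
```

The decoded column, together with the projector-camera calibration, is what gives depth by triangulation.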
Minutes
- Calibration already allows multiple models
- Resize image
- YUV can be in many formats: Y full, U&V half resolution in horizontal or vertical or both
- Convert→RGB→Resize→YUV
- Unless each plane is separate, then you can call resize on each plane
- Projected fringe pattern code
- Post CVPR
- New calibration patterns
- Infinite chessboard
- Blob chessboard
- 3D offset box
- Descriptor toolbox
- Romain dataset, 7 objects
- Using ROS, calc stereo for training image, calc point cloud, subtract table and background
- Estimate finish in a day or two
- Obj rec demo
- Given a training set with pan tilt, recognize test image and find pose
- Matching descriptors in test image with training set descriptors
- Finding images where there are best matches
- Solve PnP using stereo using Ransac
- Test data for cluttered scene
- Need to find in clutter
- Moped clusters descriptors, localized in space
- for 7 objects, works at 640×480. Want to scale to 100 objects
- Set of existing objects is at http://vault.willowgarage.com/wgdata1/vol1/IROS/bags/textured/
- The package for pan_tilt table is dp_ptu47_pan_tilt_stage
- CVPR
- Victor: Visa 3rd or 4th week of May. Can be in SiValley June 6th
- Vadim: About the same schedule ~June 6th
- NVidia
- Stereo Block Match
- 30fps with 2 cards
- Middle next week, will try to implement Hirschmuller … would be nice to get for CVPR
- Demo: 30fps high res stereo with Hirschmuller
- Original algorithm is not easy to parallelize
- Stereo
- Hirschmuller 10x slower than BM on CPU
- Needs lots of texture since not all context used
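On the resize discussion in the minutes above: when the Y, U and V planes are stored separately, each plane can be resized on its own without a round trip through RGB. The sketch below assumes a planar 4:2:0 layout (U and V at half resolution in both directions) and uses nearest-neighbour sampling for brevity; `Plane`, `Yuv420`, `resizeNearest` and `resizeYuv420` are illustrative names, not OpenCV types.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// One 8-bit image plane stored row by row.
struct Plane {
    int width;
    int height;
    std::vector<std::uint8_t> data;
};

// Nearest-neighbour resize of a single plane.
Plane resizeNearest(const Plane& src, int dstW, int dstH) {
    assert(src.width > 0 && src.height > 0 && dstW > 0 && dstH > 0);
    assert(src.data.size() == static_cast<std::size_t>(src.width) * src.height);
    Plane dst{dstW, dstH, std::vector<std::uint8_t>(static_cast<std::size_t>(dstW) * dstH)};
    for (int y = 0; y < dstH; ++y) {
        const int sy = y * src.height / dstH;
        for (int x = 0; x < dstW; ++x) {
            const int sx = x * src.width / dstW;
            dst.data[static_cast<std::size_t>(y) * dstW + x] =
                src.data[static_cast<std::size_t>(sy) * src.width + sx];
        }
    }
    return dst;
}

// Planar 4:2:0 frame: Y at full resolution, U and V at half resolution.
struct Yuv420 {
    Plane y, u, v;
};

// Resize each plane independently; the target size must be even so the chroma
// planes keep exactly half the luma resolution.
Yuv420 resizeYuv420(const Yuv420& src, int dstW, int dstH) {
    assert(dstW % 2 == 0 && dstH % 2 == 0);
    return {resizeNearest(src.y, dstW, dstH),
            resizeNearest(src.u, dstW / 2, dstH / 2),
            resizeNearest(src.v, dstW / 2, dstH / 2)};
}
```

Interleaved formats do not allow this shortcut, which is where the Convert→RGB→Resize→YUV path comes in.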
Vadim
- Still testing directory structure overhaul.
Victor
Mono visual odometry: implemented a PoseEstimator2D plugin for the vslam framework. The first pair of frames is initialized with SFM, and every subsequent frame uses 3D point clouds from previous frames to solve the PnP problem. Spent most of the time solving small problems. I am still debugging the code. [Victor]
Object recognition toolbox: implemented automatic segmentation of object mask from background and pan-tilt table. Implemented a RANSAC version of solvePnP, it is in debugging stage. [Alexander]
Detectors/descriptors testbench: spent some time looking at Mikolajczyk’s MATLAB code for detector/descriptor quality estimation and making the OpenCV testbench closer to it. Implemented a new area-based metric for estimating detector invariance. Implemented a function for estimating descriptor matching quality. [Maria]
Other: four new candidates interviewed for a replacement of Alexey. We are going to make an offer to one of them this week.
Action Items
Gary
- (./) Send Dallas resize explanation
- Close out YUV→RGB conversion bug
- Ask about hotel accommodation
- Get dataset to Victor for object
- Individual objects
- In clutter
- In refrigerator
- Talk to Scott about an “OpenCV Central” web site
- Contribute focus detector
- Substantial data set for textured objects and pose
Vadim
- Write test system example to user_contrib
- Start OpenCV cheatsheet
Victor
- Send Gary existing textured data link
From Before
Gary
- :( Talk to Scott about an “OpenCV Central” web site
- :( Contribute focus detector
- :( Substantial data set for textured objects and pose
- (./) Find out what Suat is doing, send coordinating email
- :( sync Victor with Sidd about MOPED
Vadim
- Write test system example to user_contrib
- Start OpenCV cheatsheet
Victor
- (./) sync with Suat
Agenda
- Textured object recognition
- Test system
- Cheat sheet
- Android OpenCV (from GSoC)
- Stitching interface
- New calibration patterns
- Future
- Support for fringe patterns
- CVPR
- NVidia
- Robots
- C-turtle
- Stereo
Depth from Phase or Gray Pattern
- The big idea is to use a projector to throw bands of light onto the scene.
- Some techniques do a relatively simple gray-code on binary projector images:
- http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.21.5393&rep=rep1&type=pdf
- Fancier ones use a sine-wave and phase-shift it each frame:
- http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.12# 6857&rep=rep1&type=pdf
- These guys talk more about calibration:
- http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.11# 5314&rep=rep1&type=pdf
- Structured light from phase patterns, see May 18 2010: Zhang, Siggraph 2006
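For the sine-wave variant listed above, one common textbook scheme (not necessarily the one used in the linked papers) projects three fringe images shifted by -2π/3, 0 and +2π/3 and recovers the wrapped phase per pixel with a single atan2. The helper names below are illustrative.

```cpp
#include <cassert>
#include <cmath>

// Intensity a pixel would see for a fringe with the given offset, amplitude,
// phase and projector shift (simple model: offset + amplitude * cos).
double fringeIntensity(double offset, double amplitude, double phase, double shift) {
    return offset + amplitude * std::cos(phase + shift);
}

// Wrapped phase in (-pi, pi] from three intensities captured with shifts of
// -2*pi/3, 0 and +2*pi/3. Offset and amplitude cancel out:
//   i1 - i3         = sqrt(3) * amplitude * sin(phase)
//   2*i2 - i1 - i3  = 3 * amplitude * cos(phase)
double wrappedPhase(double i1, double i2, double i3) {
    return std::atan2(std::sqrt(3.0) * (i1 - i3), 2.0 * i2 - i1 - i3);
}
```

The result is only known modulo 2π, so a separate unwrapping step (for example with a coarse gray-code pass) is still needed before converting phase to depth.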
Minutes
- Need to get team members able to commit to ROS on Willow’s site
- Current algorithm works for all 5 objects, except 2 of them have non-textured sides
- Need data now
- Textured object recognition
- Some detectors are in features_2d in outlet detection
- module called features_2d that implements everything. It’s in trunk
- has detectors, descriptors and training net
- Moped approach is being implemented as a ROS package
- Model needs stereo keypoints in 3D, keypoint match, spatially cluster keypoints (one object), ransac and geometric check
- Keypoints are found
- clustered spatially (per object)
- Ransac to find pose (PnP)
- Geometric check
- Follows Moped pretty much, use different kind of clustering, use arbitrary detectors/descriptors. Don’t match point clouds, we use PnP
- Questions
- Does it work with other detectors and descriptors
- STAR and SURF are used right now
- Plan to try with SIFT also
- Need to try Calonder
- Multi-scale Harris
- Multi-scale FAST
- Suat
- Need to get data now — this is the bottleneck right now
- SURFALS Surface patches in mesh
- Stereo
- Wavy pattern in stereo … half pixel issues.
- Rectify 2 images half-pixel offset. Take the best match but never at a half pixel off, at most a quarter pixel off
- Hirschmuller: Vadim used the Birchfield matcher, which computes the minimal difference within half a pixel of each location (interpolation)
- Link is Stereo by Birchfield and Tomasi
- Use large and small windows at the same time.
- Eigen interface
- Header files.
- Alignment issues
- Now have conversion wrappers so can mix and match Eigen
- will put out doxygen comments
- C-turtle, 23rd of May.
- Using trunk of OpenCV with ROS
- ROS trunk should try to track OpenCV trunk
- Trunk (most recent) is # 1+, but not the very latest
- Released version (latest, a little behind) —> C-turtle
- Box turtle (old stable)
- VO
- Bugs in monocular VO
- Prints point clouds to console
- SFM spoiled to rviz
- Some transform SBA (a lot of programs use the transform from world to camera, but we use the coords of camera in the world)
- Run stereo vs mono
- (Full vslam now works with loop closure) Bundle adj of all the points, subsecond times
- Android
- Give Ethan OpenCV SVN access to debug the code
- Then maybe a month or so later we can merge to trunk
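The Birchfield matcher mentioned in the stereo notes above can be sketched for a single pair of scanline pixels. This follows the sampling-insensitive dissimilarity as it is usually described (compare one pixel against the range spanned by the other image's linearly interpolated values within half a pixel); it is an illustration with hypothetical function names, not the code used in OpenCV.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// One-sided term: how far a[xa] lies outside the intensity range that b takes
// within half a pixel of xb (linear interpolation; borders are clamped).
double oneSided(const std::vector<double>& a, int xa,
                const std::vector<double>& b, int xb) {
    const int n = static_cast<int>(b.size());
    assert(xa >= 0 && xa < static_cast<int>(a.size()) && xb >= 0 && xb < n);
    const double centre = b[xb];
    const double minusHalf = 0.5 * (centre + b[std::max(xb - 1, 0)]);
    const double plusHalf = 0.5 * (centre + b[std::min(xb + 1, n - 1)]);
    const double lo = std::min({centre, minusHalf, plusHalf});
    const double hi = std::max({centre, minusHalf, plusHalf});
    return std::max({0.0, a[xa] - hi, lo - a[xa]});
}

// Symmetric dissimilarity between left[xl] and right[xr]: the smaller of the
// two one-sided terms.
double btDissimilarity(const std::vector<double>& left, int xl,
                       const std::vector<double>& right, int xr) {
    return std::min(oneSided(left, xl, right, xr), oneSided(right, xr, left, xl));
}
```

In the first test case below the plain absolute difference would be 5, but the measure is 0 because the left value falls inside the interpolated range of the right scanline, which is the half-pixel tolerance the notes refer to.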
Victor
Visual odometry:
- debugging the code.
- Fixed several bugs, including a major bug in computing inliers.
- Right now the point cloud gets corrupted somewhere in between initial structure from motion estimation (R and T look quite sensible too) and rviz where the point cloud does not make any sense. [Victor]
Object recognition:
- refactoring of ransac version of PnP.
- Experiments on a larger test dataset show stable recognition of Green tea, Pattern classification and tuna salad, and unstable recognition of Blue people and Campbell soup when they are viewed from untextured sides.
- Implemented hierarchical clustering and included it into the pipeline for recognizing several objects in the same scene
- MOPED uses mean shift segmentation, which we don’t have and which will take some time to implement.
- We have k-means, but it wouldn’t work on this type of data (a single cluster with scattered outliers). Hierarchical clustering was easy to implement and is a good proxy for mean shift on this type of data.
- The attachments contain the results of clustering of keypoints matched to a single object and the inliers corresponding to the best solution of ransaced PnP among all clusters.
- We urgently need data with several textured objects in the same scene to test on. [Alexander]
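The notes do not say which linkage the hierarchical clustering uses. As one possible reading, single-linkage clustering cut at a distance threshold separates a compact keypoint cluster from scattered outliers, which is the situation described above. The sketch below is an assumption-laden illustration (quadratic time, hypothetical names), not the pipeline code.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

struct Point3 { double x, y, z; };

// Union-find root lookup with path halving.
int findRoot(std::vector<int>& parent, int i) {
    while (parent[i] != i) {
        parent[i] = parent[parent[i]];
        i = parent[i];
    }
    return i;
}

// Single-linkage clustering cut at maxGap: two points share a cluster when a
// chain of neighbours no farther apart than maxGap connects them.
// Returns a cluster id for every point, numbered in order of first appearance.
std::vector<int> clusterByDistance(const std::vector<Point3>& pts, double maxGap) {
    assert(maxGap >= 0.0);
    const int n = static_cast<int>(pts.size());
    std::vector<int> parent(n);
    std::iota(parent.begin(), parent.end(), 0);
    for (int i = 0; i < n; ++i) {
        for (int j = i + 1; j < n; ++j) {
            const double dx = pts[i].x - pts[j].x;
            const double dy = pts[i].y - pts[j].y;
            const double dz = pts[i].z - pts[j].z;
            if (dx * dx + dy * dy + dz * dz <= maxGap * maxGap) {
                const int ri = findRoot(parent, i);
                const int rj = findRoot(parent, j);
                parent[ri] = rj;
            }
        }
    }
    std::vector<int> idOfRoot(n, -1);
    std::vector<int> labels(n);
    int next = 0;
    for (int i = 0; i < n; ++i) {
        const int r = findRoot(parent, i);
        if (idOfRoot[r] < 0) idOfRoot[r] = next++;
        labels[i] = idOfRoot[r];
    }
    return labels;
}
```

Each sufficiently large cluster could then be handed to the ransac PnP step on its own, while isolated outliers end up in single-point clusters that are easy to discard.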
Detectors/descriptors testbench:
- implemented a helper for computing descriptors that can choose keypoints either from applying a detector or by projection from another image with a given homography.
- There is an option of filtering keypoints based on confidence.
- Tests for surf and sift descriptors are implemented. A problem with sift crashing has been resolved. All changes are merged into opencv trunk.
Other: Ilya Lysenkov will start working for Willow project instead of Alexey from May 24th.
Vadim
- Finished reorganization of OpenCV directory structure, fixed some critical problems caused by the reorganization.
- Studied Doxygen, integrated it into the OpenCV build system and added some experimental documentation comments to OpenCV headers. As a result, Doxygen detection and execution are now integrated into the OpenCV CMake build system. HTML docs (currently covering most of cxcore) are automatically generated.
Here is what the cv::Mat documentation looks like (to get the proper look, save all 3 files into the same directory).
Plans:
- finish with Doxygen.
- add Googletest + some experimental tests
- make a short document on how to prepare user contributions.
Action Items
Gary
- Get other team members on ros commits
- Get data to Victor
Vadim
Victor
- Send James message about OpenCV revision to go into C-Turtle ROS
From Before
Gary
- (./) Send Dallas resize explanation
- Close out YUV→RGB conversion bug
- (./) Ask about hotel accommodation
- Get dataset to Victor for object
- Individual objects
- In clutter
- In refrigerator
- Talk to Scott about an “OpenCV Central” web site
- Contribute focus detector
- Substantial data set for textured objects and pose
Vadim
- Write test system example to user_contrib
- Start OpenCV cheatsheet
Victor
- Send Gary existing textured data link
Agenda
- features2d
- Test program
- Example of how to add
- OpenCV and ROS … good to go?
- User contrib
- CVPR
- 6/15/2010 17:20—21:00 Robot Vision and Manipulation on Willow Garage’s PR2 (Tuesday)
- 6/16/2010 17:20—21:00 OpenCV on NVidia Steroids (Wednesday)
- Calibration pattern generalization
- 1 and 2d bar codes
- Pipeline for getting homography from marked points
- GSOC results
- Android port … trouble with cmake
- Stitching
Depth from Phase or Gray Pattern
- The big idea is to use a projector to throw bands of light onto the scene.
- Some techniques do a relatively simple gray-code on binary projector images:
- http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.21.5393&rep=rep1&type=pdf
- Fancier ones use a sine-wave and phase-shift it each frame:
- http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.12# 6857&rep=rep1&type=pdf
- These guys talk more about calibration:
- http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.11# 5314&rep=rep1&type=pdf
- Structured light from phase patterns, see May 18 2010: Zhang, Siggraph 2006
Minutes
- Features2D
- Algorithmic test
- Victor to send test instructions
- Can we get homography to represent a change in view point of “X” degrees — this is good for testing purposes
- R,t surface normal and distance to surface. And returns the homography
- Plane and rotation and translation
- perspective warp
- Textured object detection
- Looks like enough resolution on the new stuff, but didn’t get bags, got images, so no 3D
- Suat working on combining things into a single point cloud/object
- Want to have him working on creating 3D models
- Documentation finished tomorrow
- 2 different things:
- Images in — visualize matches
- What Kurt wants
- Stereo images with camera parameters
- Have test set and training set
- Train on training set, makes viewpoint based model
- Run on test set and see how you do
- R and T in different form
- polymorphic
- seq, cvmat
- In c api, tried to make functions accept different forms of matrix
- Python
- numpy arrays standard
- Functions are polymorphic, so could work with
- Tickets to make C and C++ more polymorphic
- Doxygen
- Simplify documentation of new functionality
- Old functionality would be shallow
- 1 or 2 documentation systems?
- Contributed functionality →Doxygen
- New functionality →Doxygen
- Only for user contrib
- Stick with Latex→Sphinx
- OpenCV and ROS
- Headers in ROS packages
- cv.hpp cvaux.hpp caused problems
- Victor added proxy headers so that packages will compile
- Still one package: Outlet pose
- Need to change outlet pose, not enough to just put in proxy headers
- Work on CVPR tutorial when Vadim gets in
- Cheatsheet
- Vadim June 4th
- Victor …
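The helper requested under Features2D (R, t, surface normal and distance in, homography out) corresponds to the standard plane-induced homography H = R + t nᵀ / d. The sketch below works in normalized image coordinates and assumes the plane is n·X = d in the first camera's frame with X2 = R·X1 + t; for pixel coordinates H would additionally be wrapped with the camera matrices. The function names are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cmath>

using Vec3 = std::array<double, 3>;
using Mat3 = std::array<std::array<double, 3>, 3>;

// Homography induced by the plane n . X = d (first-camera coordinates) under
// the motion X2 = R * X1 + t, acting on normalized image coordinates.
Mat3 planeHomography(const Mat3& R, const Vec3& t, const Vec3& n, double d) {
    assert(d != 0.0);
    Mat3 H;
    for (int i = 0; i < 3; ++i)
        for (int j = 0; j < 3; ++j)
            H[i][j] = R[i][j] + t[i] * n[j] / d;
    return H;
}

// Apply H to the normalized image point (x, y) and dehomogenize.
std::array<double, 2> warpPoint(const Mat3& H, double x, double y) {
    const double u = H[0][0] * x + H[0][1] * y + H[0][2];
    const double v = H[1][0] * x + H[1][1] * y + H[1][2];
    const double w = H[2][0] * x + H[2][1] * y + H[2][2];
    assert(w != 0.0);
    return {u / w, v / w};
}
```

A viewpoint change of “X” degrees for testing would then amount to choosing R as a rotation by that angle, picking t, and warping the image with the resulting H.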
Maria on testing 2D features
- I implemented the test on detectors repeatability.
- Some information about this test is in opencv_extra/testdata/cv/detectors/detector_repeatability_test.txt.
- You may see the results of the test (i.e. the location and region repeatability of detectors) in
- opencv_extra/testdata/cv/detectors/algorithms/[detectorName]_res.xml files.
- The location repeatability is small in some cases, maybe because of inadequate fitting of detector parameters.
- As regards the region repeatability, I implemented it for case of scale invariant detectors only (the region is a circle).
- But there is a region repeatability criterion for affine invariant detectors (the region is an ellipse) in Mikolajczyk’s papers too.
- It seems that OpenCV detectors are scale invariant only. Affine normalization based on the second moment matrices is described in
- “Scale & Affine Invariant Interest Point Detectors”, Mikolajczyk.
- This procedure transforms the circular region of a key point into an elliptic region.
- Affine normalized detectors have higher region repeatability in case of significant image transformations, because the scale change is, in general, different in each direction.
- I think it would be useful to implement this affine normalization, not only for the detectors repeatability test. We can get more stable descriptors for such affine normalized detectors.
Vadim
- “Doxygenated” the following OpenCV modules (C++ part only): core (finished), imgproc, calib3d, video, features2d, objdetect, highgui.
- Started integration of googletest into OpenCV project tree.
Plans:
- finish with Doxygen, improve the appearance of the docs, experiment with the search
- add some experimental googletest-based tests
- make a short document on how to prepare user contributions.
Victor
Object recognition: bag files captured by Radu show low texture/resolution; there are not enough correct matches.
The dependence of correct matches on resolution has been investigated: images of the “Pattern Classification” book and a green tea box were taken by a consumer IP camera from different distances, and inliers were found by ransac+homography calculation. The results are presented below. 1 meter is a critical distance after which homography calculation is unstable for the book, which is textured. The tea box is detected from 0.5m only for 640×480.
Results: inlier count
distance | 640x480 | 1280x720
about 1m | 25 | 44
about 0.7m | 68 | 86
about 0.5m | 109 | 220
Similar experiment with Greenfield tea:
640x480 | 1280x720
2 | 6
0 | 7
8 | 15
An algorithm for automatically extracting training images from a bag file is implemented. It incrementally adds new frames that have low inlier count with existing training set. The documentation is in progress. The new package name is textured_object_detection.
Detectors/descriptors: implemented support for angle recalculation in the sift wrapper (similar to surf). As a result the sift descriptor shows better matching than surf on test data. Fixed problems with setting the keypoint scale KeyPoint::size for several descriptors (sift, gftt, mser).
OpenCV/ROS: fixed compilation problems for several ROS packages induced by the new opencv structure. Added a patch for opencv that fixes most of the compilation problems for the tags/latest revision of packages.
Kirill/GPU
Accomplishments:
- Current code ported to the NVidia Driver API and debugged. A multi-GPU version has been created; it shows 19.5 fps on a Fermi+Quadro configuration, where Fermi does 75% of the work and Quadro 25%. We measure the time of GPU matching only; other stuff like color conversions and visualization is not counted.
- Some activities with the DemoCVPR application:
- OpenCV stereo BM and SGBM (Hirschmueller) added. Switching in realtime between algorithms implemented.
- Information window created (performance, stereo parameters, resolution).
- Refactoring in order to get more flexibility, first of all for integration with camera code.
- Started to think about porting Hirschmueller and Daisy (http://cvlab.epfl.ch/~tola/daisy.html) algorithms to the GPU. Here we need to estimate time needed for implementation.
Plans:
- Tune the multi-GPU version.
- Maybe integrate with Joe’s code, which receives frames from cameras and does the deBayer transformation.
- Design GPU versions of the Hirschmueller and Daisy, probably start this work.
Action Items
Gary
- Test out user contrib process
- Contribute focus detector
- Contribute anysize pyramid/laplacian code
- Close out YUV→RGB conversion bug
- Hotel accommodations for Vadim
Kurt
- Send HD stereo with calibration camera parameters
James
- Eitan to turn calibration R&T translation difficulties into tickets
Vadim
Victor
- Send Gary instructions on how to use the features2d test program
From Before
Gary
- ? Get other team members on ros commits
- (./) Get data to Victor
Vadim
Victor
- (./) Send James message about OpenCV revision to go into C-Turtle ROS
Agenda
- Matching
- VO
- cvpr
- svn
Minutes
Matching
- At high resolution, FAST+SURF is working fairly well for doing pnp ransac verification
- Alexander and Maria on this
- Data collection
- Taking Tea box through all rotations in video to try to get model
- Dataset at http://itseez.com/data/canon-base-1.0.tar.gz
VO
- SFM generates decent pose estimation
- But full pipeline is producing bad results in rviz
- Wim wants VO for Texai which only has monocular
- Have wheel odometry to help
Documentation
- Thinking of possibility of writing directly in Sphinx
- Victor still no visa
- Vadim on the 4th
Victor
Object detection:
- Created a dataset with a stereo pair of Canon 40D cameras. Each image has been scaled down to 1280×853, and a training set of 3 objects has been created. Initial experiments show promising results: there are a lot of keypoints and matching works well enough for ransac+pnp to find a sufficient number of inliers. Some examples of inliers and correspondences are attached. [Alex]
Detectors/descriptors:
- A new sample keypoints_matching.cpp has been implemented. For a single input image it generates a random perspective transformation of it and matches descriptors from the original and transformed images. It can optionally calculate inliers with ransac+findHomography. A clear() method has been implemented for the GenericDescriptorMatch interface. [Maria]
OpenCV:
- There have been a number of issues with building ros packages against the latest opencv, including deprecated headers and several changes around the flann namespace. All of that has been fixed and the new opencv has been released together with vision_opencv 1.# # [Victor]
SVN
- Eliahoo can’t
- Create account
- request an account
- can’t check in
- Nathnan not too responsive about this
- Brian asked
Vadim
Here is the progress:
- Doxygen expansion was paused after some problems with it were discovered. In particular, Doxygen does not distinguish between these 2 functions:
//! computes convex hull for a set of 2D points.
CV_EXPORTS void convexHull( const Mat& points, vector<Point>& hull, bool clockwise=false );
//! computes convex hull for a set of 2D points.
CV_EXPORTS void convexHull( const Mat& points, vector<Point2f>& hull, bool clockwise=false );
- because Point and Point2f are different specializations of the same template class Point_<>. It looks like similar problems have been reported a couple of years ago in Doxygen bug tracker.
- Now considering RST (Sphinx) as a possible alternative.
- Some OpenCV .tex files have been fixed to produce cleaner HTML OpenCV docs without garbage unprocessed “:math:`…`” strings. Also it’s now possible to build PDFs out of the generated rst files, so in theory the rst can be the original format.
- That is, instead of (latex→pdf & latex→rst→html) we can produce both online and offline documentation from rst (via rst→html & rst→latex→pdf).
- The ancient OpenCV mechanism for reading and writing “registered” data types, such as CvMat, CvHistogram, CvSeq etc. has been updated to support the new C++ structures more easily using cv::RTTIImpl<> class. cv::HOGDescriptor has been used to test this. This functionality is going to be used for the currently implemented descriptor toolbox.
- 5 tickets (various patches from users) have been closed (## 330, 355, 356, 359, 360).
Action Items
Gary
- Get Vadim a place to stay
- (./) Complain about Ilya Lysenkov not being able to get commit to SVN
Agenda
- CVPR
- Features 2D (introduce at CVPR Tutorial)
- Documentation
Minutes
CVPR
- 10 to 15 foils for Features 2D
- Have people do sample code, add to it
- Demo at CVPR
- Finds pose of object in clutter
- Feature 2D
- New dataset http://itseez.com/data/canon-base-1.1.tar.gz
- Joe will run NVidia demo
- Demo runs on one Fermi card (90ms/frame) correlation + post processing on CPU
- Fermi and Quadro (slower card)
- Expect 20fps ultimately
- Can get working in parallel — process on GPU, post process on CPU in tick tock fashion, probably 10ms per card speedup
- Integrate with real camera
- Object Recognition
- Scaling
- Clutter
- Interface for window based search (Patrick did)
- Called Features2D grid Adapter class
- Sparse descriptors in dense framework
GSOC, got progress
- Victor:
- Whole pipeline running for Pascal dataset using bag of words
- Gary:
- Port of OpenCV to Android with build system nearly done using Android make system
- Spherical image stitching and blending version 1 pretty much done
- Vadim:
- Implemented GUI part of HighGUI in Qt
- Victor:
- In features_2d, we have the grid_adapter.h class
#ifndef FEATURES_2D_GRID_ADAPTER_H
#define FEATURES_2D_GRID_ADAPTER_H
#include <vector>
#include <features_2d/detector.h>
namespace features_2d {
/**
 * \brief Adapts a detector to partition the source image into a grid and detect
 * points in each cell.
 */
class GridAdapter : public FeatureDetector
{
public:
  GridAdapter(const cv::Ptr<FeatureDetector>& detector, int max_total_keypoints,
              int rows = 4, int cols = 4)
    : detector(detector),
      max_keypoints(max_total_keypoints),
      rows(rows), cols(cols)
  {
  }
  cv::Ptr<FeatureDetector> detector;  // wrapped detector, run once per grid cell
  int max_keypoints;                  // budget for the whole image
  int rows;
  int cols;
protected:
  virtual void detectImpl(const cv::Mat& image, const cv::Mat& mask,
                          std::vector<cv::KeyPoint>& keypoints) const;
};
} //namespace features_2d
#endif
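The header above only declares detectImpl(), so the notes do not show how the per-cell budget is enforced. The following is a minimal sketch of one way to do it, not the actual GridAdapter implementation: it buckets already-detected keypoints into grid cells and keeps the strongest ones per cell, instead of running the wrapped detector on each cell. The KeyPoint struct and the function name limitPerGridCell are stand-ins invented for this example so that it compiles without OpenCV.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for cv::KeyPoint (assumption: only these fields matter here).
struct KeyPoint {
    float x;
    float y;
    float response;  // detector strength, larger is better
};

// Keep the strongest keypoints in each cell of a rows x cols grid laid over a
// width x height image, so that at most max_total keypoints survive overall.
std::vector<KeyPoint> limitPerGridCell(const std::vector<KeyPoint>& keypoints,
                                       int width, int height, int max_total,
                                       int rows = 4, int cols = 4)
{
    assert(width > 0 && height > 0 && rows > 0 && cols > 0 && max_total >= 0);
    const std::size_t per_cell = static_cast<std::size_t>(max_total / (rows * cols));

    // Bucket every keypoint into the grid cell that contains it.
    std::vector<std::vector<KeyPoint> > cells(static_cast<std::size_t>(rows) * cols);
    for (const KeyPoint& kp : keypoints) {
        if (kp.x < 0 || kp.y < 0 || kp.x >= width || kp.y >= height) continue;
        const int c = std::min(static_cast<int>(kp.x * cols / width), cols - 1);
        const int r = std::min(static_cast<int>(kp.y * rows / height), rows - 1);
        cells[static_cast<std::size_t>(r) * cols + c].push_back(kp);
    }

    // Within each cell, keep only the per_cell strongest responses.
    std::vector<KeyPoint> kept;
    for (std::vector<KeyPoint>& cell : cells) {
        std::sort(cell.begin(), cell.end(),
                  [](const KeyPoint& a, const KeyPoint& b) { return a.response > b.response; });
        if (cell.size() > per_cell) cell.resize(per_cell);
        kept.insert(kept.end(), cell.begin(), cell.end());
    }
    return kept;
}
```

The point of a per-cell budget is that a strongly textured region cannot consume the whole keypoint allowance, which is what makes sparse detectors usable in a dense, window-based search.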
Vadim
Here is the progress:
- Several bugs (findHomography(), cvLoadImage() …) have been fixed (trac tickets: ##304, 338, 369, 370)
- OpenCV cheatsheet is in progress. It’s being done in Sphinx. The current very initial version is attached.
Victor
Object detection:
- created a new dataset http://itseez.com/data/canon-base-1.1.tar.gz, consisting of 10 textured objects. The dataset is captured with Canon 40D stereo pair and all images are resized to 1280×85# [Alexander]
- Experiments on high resolution images show good recognition rate for all objects except for ‘stapler’ (that is textureless) and ‘diploma’ (is being investigated now).
- An example of detection is attached (test image and rviz point cloud, each color corresponds to an object: rose – 3M disks box, yellow – fallout cd, white – pcmcia adapter box). Experiments on low resolution (640×480) show unstable matching and recognition, close to what we saw on bag files captured by Radu. [Maria, Alexander]
Visual odometry:
- Implemented a test system for the whole mono pipeline. The test system feeds sba::voSt object with keypoints obtained by projecting a simulated point cloud to simulated camera positions. Several bugs have been corrected and the pipeline works correctly on simulated data now. A test on new_college shows reasonable camera trajectory (straight line diverging to the right) and point cloud (visible ground plane, although noisier compared to stereo) on the first several seconds and then both become obviously corrupt. [Victor]
- OpenCV: added one way descriptor into descriptor testbench. Modified one way descriptor interface for a more convenient parameters input/output (single file instead of several). [Ilya]
Kirill/GPU
Accomplishments:
- DemoCVPR performance measuring and tuning. The whole processing pipeline initially ran at 8.5-10fps instead of 16fps (matching alone) on a single GPU. We are working on optimization of the whole flow. Currently we run speckleFiltering (the most time consuming CPU procedure) in parallel with other routines and it gives 1# 5fps with a single GPU.
- Working on user manual for CVPR demo. Document will be finished soon
- Small activities:
– fixed bug in GPU-kernel – sometimes disparity used to round to nearest multiple of 8.
– experiments with Middlebury 2005 and 2006 datasets.
– work with config improved (config name received as cmd parameter, class AppSettings created, some settings renamed).
– tried moving winsize from a preprocessor definition to a variable (the next step would be to move it to the config). This caused a significant performance decrease because the compiler stopped unrolling the loops, so this activity was cancelled.
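The winsize slowdown is consistent with how compilers treat loop bounds: a bound known at compile time can be fully unrolled, a runtime variable usually cannot. One common way to keep a configurable window without losing the compile-time bound is to make the size a template parameter and dispatch at runtime among a few supported values. The sketch below shows the idea in plain CPU C++ rather than in the GPU kernel; the function names and the supported sizes are assumptions, and whether this recovers the lost performance in the actual kernel has not been tested.

```cpp
#include <cassert>
#include <cstdlib>

// Sum of absolute differences over a WinSize x WinSize window whose top-left
// corner is at the given pointers. WinSize is a compile-time constant, so the
// compiler is free to fully unroll both loops.
template <int WinSize>
int sadWindow(const unsigned char* left, const unsigned char* right, int stride)
{
    int sum = 0;
    for (int y = 0; y < WinSize; ++y)
        for (int x = 0; x < WinSize; ++x)
            sum += std::abs(int(left[y * stride + x]) - int(right[y * stride + x]));
    return sum;
}

// Runtime dispatch: the window size can come from a config file, but each
// branch still calls code specialized for a fixed size.
int sadWindowDispatch(int winSize, const unsigned char* left,
                      const unsigned char* right, int stride)
{
    switch (winSize) {
        case 3: return sadWindow<3>(left, right, stride);
        case 5: return sadWindow<5>(left, right, stride);
        case 7: return sadWindow<7>(left, right, stride);
        default:
            assert(false && "unsupported window size");
            return -1;
    }
}
```

The cost of this design is one instantiation per supported size, so the list of sizes has to be fixed at build time.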
Plans:
- Integrate with Joe’s code that works with SI camera.
- Finish user manual.
- Clean DemoCVPR.
Action Items
Gary
- Find out about Texai use at CVPR
- Talk to Matai about integration if we can get Radu’s images working
- Talk to Joe about NVidia demo
- Make sure Marius knows about Patrick’s work
Radu
- Get objects in database to Victor
Victor
- Write Features2D documentation and tutorial
From last time
Gary
- (./) Get Vadim a place to stay
- (./) Complain about Ilya Lysenkov not being able to get commit to SVN
Agenda
- Test system
- User contrib
- Features2D
- Obj Rec
- Testing framework
- VO
- Object Rec strategy/goals
- Generalizing calibration/feature points
Minutes
Features2D
- Lowered threshold and switched off non-max suppression on FAST
- Get 5-10K points per image instead of 10s
- Upped the recognition
- 100’s of inliers per object up from like 3
- Looking at better clustering techniques:
- When find correspondence for test descriptors
- Look for cluster of descriptors corresponding to the same object and near each others
- Run ransac on the clusters only
- Forward backwards check:
- Image A and B. Take a point in A and find its correspondence in B. Do the same thing from B to A. If the two matches do not agree, drop the correspondence.
- Ratio test from SIFT features
- Ransac is the bottleneck right now. It should be fast if the number of iterations is low, but for us it’s high
- Moped doesn’t do ransac; they randomly choose 5 correspondences, and if there are few inliers they throw them away. Do it within the object cluster
- Need some ways of speeding up
- Prefilter
- Plans
- Recognition rate test with differing numbers of ransac iterations
- Forward backward check
- Ratio test
- S. in touch last week, replicating Moped
- Alexander doing this, Maria does this but on vacation.
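The forward-backward check and the ratio test planned above can be sketched as follows. This is a brute-force illustration, not the code in textured_object_detection: the Descriptor typedef, the function names and the default ratio of 0.8 are assumptions made for the example, and the L1 distance is used because the notes report it working better for this pipeline.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

typedef std::vector<float> Descriptor;

// L1 (sum of absolute differences) distance between two descriptors.
float l1Distance(const Descriptor& a, const Descriptor& b)
{
    assert(a.size() == b.size());
    float d = 0.f;
    for (std::size_t i = 0; i < a.size(); ++i) d += std::fabs(a[i] - b[i]);
    return d;
}

struct Nearest {
    int index;     // index of the closest train descriptor, -1 if train is empty
    float best;    // distance to the closest descriptor
    float second;  // distance to the second closest descriptor
};

// Brute-force search for the two nearest neighbors of query in train.
Nearest findNearest(const Descriptor& query, const std::vector<Descriptor>& train)
{
    const float inf = std::numeric_limits<float>::infinity();
    Nearest n = {-1, inf, inf};
    for (std::size_t i = 0; i < train.size(); ++i) {
        const float d = l1Distance(query, train[i]);
        if (d < n.best) { n.second = n.best; n.best = d; n.index = static_cast<int>(i); }
        else if (d < n.second) { n.second = d; }
    }
    return n;
}

// Returns (index in a, index in b) pairs that pass both filters.
std::vector<std::pair<int, int> > filteredMatches(const std::vector<Descriptor>& a,
                                                  const std::vector<Descriptor>& b,
                                                  float maxRatio = 0.8f)
{
    std::vector<std::pair<int, int> > matches;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const Nearest fwd = findNearest(a[i], b);
        if (fwd.index < 0) continue;
        // Ratio test: the best match must be clearly better than the runner-up.
        if (!(fwd.best < maxRatio * fwd.second)) continue;
        // Forward-backward check: the match from B back to A must return to i.
        const Nearest bwd = findNearest(b[fwd.index], a);
        if (bwd.index != static_cast<int>(i)) continue;
        matches.push_back(std::make_pair(static_cast<int>(i), fwd.index));
    }
    return matches;
}
```

Both filters shrink the match set before RANSAC, which should raise the inlier ratio and so reduce the number of iterations needed; the price is a second nearest-neighbor search per surviving match.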
Mono VO
- Victor getting occasional loss of tracking
- Choice of keyframes is important
- Again turning down threshold on FAST and using forward back check
- Smoothing prior will help
- Helps with small number of inliers too
- Basically VO works on short sequences
- Incorporate changes to SBA that Helen has been making
- Almost ready to play with
PTAM can get pretty good results on table top tracking
- PTAM might have a relocalizer in it. They also have trackers on each point
- We’ll be matching frames instead.
- Want to compare Mono against
Test system
- Are not reporting what failed
- OpenCV vs ROS: Cannot just report what test failed since we run across several operating systems
- Either remove test temporarily and issue a Ticket
- Go to Trac and can see which test were failing
- Haven’t implemented a log function to be OS specific
- Hudson would be more convenient
- For now, black list and raise a ticket.
Stereo
- Vadim ran on stereo data using Hirschmuller
- Victor: NVidia wants dense stereo
Object Recognition
- Marius is putting together a recognition pipeline to the level where a non-computer vision coder can train objects
- Scaling is an issue
- Getting it over 10 objects
- Pre-filter, recognizer and verifier
- Large dataset in same format
- Using Romain’s data format
- 1Mpix camera, have little data
- Can deal with messages not being there (pan tilt unit)
- Make a public data set test bench
- Pairs of line features
General Calibration Object detector
- Victor will send time estimate.
Victor
- Textured object recognition: a considerable improvement in recognition of objects from Radu’s 640×480 dataset http://vault.willowgarage.com/wgdata1/vol1/objects_pantilt_database/ is caused by lowering the threshold of the FAST detector. This resulted in thousands of keypoints per image instead of hundreds, and it looks like a low repeatability rate was the main reason for the small number of inliers. A side effect is a lower ratio of inliers to the total number of matches, so we have to use a much higher number of ransac iterations (10-100K instead of 1K). Below are some examples of recognition and pose estimation. [Maria, Alexander]
Figures
- Image recognition and pose:
- Inlier points
- Pose
- Shape recognition: PAS features have been integrated into textured_object_detection and first experiments with estimating the pose of a textureless object were carried out. While there still are lots of issues (clutter, edge occlusions, a high ratio of inconsistent edge correspondences), there are some promising results (see images below) for objects that have low texture. [Ilya]
- Line pairs for recognition
- Visual odometry: brought the code up to date with the latest changes in voSt and sba. There still are issues on the full new_college sequence, the tracking is lost after some time due to a small number of inliers. Several experiments with keyframe selection showed that PTAM method (based on average distance to the camera) is preferable to other empirics. [Victor]
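The jump from 1K to 10-100K ransac iterations reported in the textured object recognition item follows from the standard RANSAC iteration estimate N = log(1 - p) / log(1 - w^s), where w is the inlier ratio, s is the sample size and p is the desired probability of drawing at least one all-inlier sample. The sketch below evaluates it; the function name, the default confidence of 0.99 and the sample size of 4 used in the usage note are assumptions, since the notes do not state which values the pipeline uses.

```cpp
#include <cassert>
#include <cmath>

// Number of RANSAC iterations needed so that, with probability confidence,
// at least one random sample of sampleSize correspondences contains no outliers.
long ransacIterations(double inlierRatio, int sampleSize, double confidence = 0.99)
{
    assert(inlierRatio > 0.0 && inlierRatio < 1.0);
    assert(sampleSize > 0 && confidence > 0.0 && confidence < 1.0);
    // Probability that one random sample is made of inliers only.
    const double pAllInliers = std::pow(inlierRatio, sampleSize);
    // log1p keeps precision when pAllInliers is very small.
    return static_cast<long>(std::ceil(std::log(1.0 - confidence) / std::log1p(-pAllInliers)));
}
```

With a sample size of 4, an inlier ratio of 0.5 needs on the order of 70 iterations, while an inlier ratio of 0.1 needs tens of thousands. This is why filtering matches before RANSAC matters more than tuning RANSAC itself.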
Vadim
The last week was spent in fixing various bugs:
- tickets ## 347, 388, 413
- multiple crashes in the tests because of some CvMat::type and CvSeq::flags bit field reorganization. At the same time, XML/YAML I/O functions for sequences have been modified to avoid dependency on the binary representation of the structures.
- compile errors with GCC # 1 on Ubuntu 8.04 (by disabling precompiled headers)
- incorrect image rendering and memory leak in waitKey() (reported by Tandent engineers)
Plans:
- do experiments with Hirschmuller algorithms on high-resolution images
- distribute latex documentation by modules
- video stitching?
Action Items
Gary
- Write Tamir for clarification about devices in November whether the data is open for others to use. Better vision camera in conjunction with it?
Vadim
Victor
- Get estimates of time for planar homography calibration patterns.
Kurt
- Find out what Suat is doing
From last time
Gary
Vadim
Victor
Agenda
- Get estimates of time for general_planar homography calibration patterns. (?)
- VO timeline
- Feature object rec
- NVidia plans
- Latex docs … bugs
- User contrib
- Test system
- Object recognition plans
Minutes
Planar calibration patterns
- Outlet detection/blob detection from ROS into OpenCV to build a calibration object.
- Given a pattern, can find calibration pattern
- Would take 2 days to a week
Bar codes
- 2D barcodes would be nice, James has wrapped a library in Python
- Don’t know if we should implement
Mono VO
- Testing on camera around table
- Quite a few parameters, trying to understand how they influence results
- Sometimes tracking is lost since all inliers are lost
- Implement regularization?
- When tracking is lost, we’re lost
- Initializes with SFM and then uses SolvePnP
- If SolvePnP fails, we stop. We should instead go back to SFM (no scale), regularize the scale by interpolating camera motion, and then return to SolvePnP
- We don’t have place recognition/loop closure yet, but it is in place in stereo slam
- Need to unify the external files for vocab trees. They are hard coded, they should be downloadable.
- Patrick has already put these on Vol1.
- Stereo VSLAM seems to be working now. Good results on outdoor data
- Smoothing paper was sent for motion.
Object Recognition
- Need data
- Create a flexible “tool kit” for future recognition needs
- Currently, our detector-descriptor recognition approach is:
- The training set consists of object images together with 3d points computed from stereo. Stereo is used to find object mask. Keypoints are extracted from each object ( FAST, threshold=5, nonmax suppression is off) and descriptors (SURF) are computed.
- For each test image
a. Find keypoints and compute descriptors ~ 0.2 sec
b. Find the nearest training descriptors to each test descriptor (using L1 distance) ~ 30-80 sec
c. Label each test keypoint with an object id of the corresponding training keypoint
d. For each object id
1) cluster test keypoints with the same object id. Hierarchical clustering is used with a stop criterion on distance between cluster centers (max distance = 100). Test points with correspondences from different images are not allowed into the same cluster ~ 0.01 sec
2) Select clusters with keypoint count higher than a threshold (threshold=50)
3) Run RANSAC SolvePnP on each cluster (iterations count = 100) ~ 0.1-2 sec (depends on clusters count)
4) Each cluster with the number of inliers higher than a threshold (=50) is recognized as an object with the current object id.
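Steps d.1 and d.2 above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code in textured_object_detection: TestPoint, Cluster and clusterTestPoints are hypothetical names, the points passed in are assumed to already share one object id, and the pairwise search is a naive O(n^3) loop that would be too slow for thousands of keypoints.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// A test keypoint already labeled with one object id (step c); trainImage is
// the training image its matched descriptor came from.
struct TestPoint {
    float x, y;
    int trainImage;
};

struct Cluster {
    float sumX, sumY;                  // running sums, used for the center
    int trainImage;                    // all members share this training image
    std::vector<std::size_t> members;  // indices into the input points
    float centerX() const { return sumX / static_cast<float>(members.size()); }
    float centerY() const { return sumY / static_cast<float>(members.size()); }
};

// Greedy agglomerative clustering: repeatedly merge the two closest clusters
// until no pair of centers is within maxCenterDist, then keep only clusters
// with more than minCount members.
std::vector<Cluster> clusterTestPoints(const std::vector<TestPoint>& points,
                                       float maxCenterDist, std::size_t minCount)
{
    assert(maxCenterDist >= 0.f);
    std::vector<Cluster> clusters;
    for (std::size_t i = 0; i < points.size(); ++i) {
        Cluster c;
        c.sumX = points[i].x;
        c.sumY = points[i].y;
        c.trainImage = points[i].trainImage;
        c.members.push_back(i);
        clusters.push_back(c);
    }

    const float maxDist2 = maxCenterDist * maxCenterDist;
    for (;;) {
        std::size_t bestI = 0, bestJ = 0;
        float bestDist2 = maxDist2;
        bool found = false;
        for (std::size_t i = 0; i < clusters.size(); ++i) {
            for (std::size_t j = i + 1; j < clusters.size(); ++j) {
                // Points matched to different training images never share a cluster.
                if (clusters[i].trainImage != clusters[j].trainImage) continue;
                const float dx = clusters[i].centerX() - clusters[j].centerX();
                const float dy = clusters[i].centerY() - clusters[j].centerY();
                const float d2 = dx * dx + dy * dy;
                if (d2 <= bestDist2) { bestDist2 = d2; bestI = i; bestJ = j; found = true; }
            }
        }
        if (!found) break;  // stop criterion: no centers close enough
        clusters[bestI].sumX += clusters[bestJ].sumX;
        clusters[bestI].sumY += clusters[bestJ].sumY;
        clusters[bestI].members.insert(clusters[bestI].members.end(),
                                       clusters[bestJ].members.begin(),
                                       clusters[bestJ].members.end());
        clusters.erase(clusters.begin() + static_cast<std::ptrdiff_t>(bestJ));
    }

    std::vector<Cluster> kept;
    for (std::size_t i = 0; i < clusters.size(); ++i)
        if (clusters[i].members.size() > minCount) kept.push_back(clusters[i]);
    return kept;
}
```

With the values quoted in the notes this would be called as clusterTestPoints(points, 100.f, 50), and each surviving cluster would then go to RANSAC SolvePnP (step d.3).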
Textured objects
- Object recognition working better, but takes 1 minute
- How to speedup?
- Indexing by KDtree or FLANN (but FLANN uses Euclidean distance and Victor uses L1 distance, which works better)
- For keypoints to reference
- Pose recognition is faster now
- 13 objects, 5 to 10 poses
- How many reference stereo pairs are we checking. We use clustering so it’s probably around 10 clusters per object
- PrimeSense will be commercially available in December. We
GPU
- Merging block matching on GPU with OpenCV by implementing an API for this
- Make it work with Linux and ROS
- Comparing Hirschmuller with BM
- Preliminary result is that Hirschmuller is better but not critically better
- Decision point of what to do next with Stereo
Stitching
Docs
- Split into modules
- Is there an easier way to temporarily enter changes?
Buy another server? I7
Victor
Textured object recognition: a new test base has been created, consisting of a training set of 13 objects and a test set of the same 13 objects with 20 test images per object http://itseez.com/data/pantilt_640x480.tar.gz. The latest results of textured_object_detection on this dataset are summarized in the table below. Only two objects are not recognized robustly (All and Blue people). Recognition has been sped up from 5 minutes per test image to 1 minute. The most important change is a new version of the clustering algorithm that does not merge test keypoints with correspondences from different training images into one cluster. As a result a lot of small clusters can be filtered out and pose estimation is done on only a few clusters per test image. The current bottleneck is in matching; a new version of BruteForceMatcher that supports indexing for both L2 and L1 is on the way. [Alexander]
object name | recognition count (the number of correct recognitions per 20 images) |
tea | 20 |
All | 12 |
Blue people | 15 |
Coke-Cola | 19 |
Camplbells | 20 |
Golden Dragon | 20 |
Green tea | 20 |
Green tea with lemon | 20 |
Jasmine Pearl | 20 |
Naked | 19 |
Pattern Classification | 19 |
Ruby Red Chai | 20 |
Tuna Salad | 20 |
Shape recognition: Added many-to-many feature matching that filters correspondences by a distance threshold. Added filtering of correspondences using a Hough transform. This resulted in better detection of a textureless object from Radu’s dataset (see the image below). This algorithm is sensitive to the silhouette of the object and thus the current version of the pantilt_640x480 dataset cannot be used (it crops object edges). A manual operation will be needed to recreate this dataset with acceptable object masks and then an experiment on this dataset will be performed. [Ilya]
Visual odometry: mono visual odometry is integrated into the full pipeline so that stereo and mono can be run together without recompiling the code. A table video sequence has been shot with a Canon 500D in HD and resized down to 640×360. There are two issues: 1) after some time tracking is lost, like in new_college, 2) there are quite a lot of points for which the distances to the camera were calculated incorrectly (they appear much closer to the camera than they are), because a wrong match is occasionally considered an inlier (quite a lot of keypoints are used for tracking, with the FAST threshold varying from 5 to 10) [Victor]
OpenCV: after a period of active development buildbot shows a lot of problems on all systems. All build problems have been located and corrected (thanks to Vadim!), as well as some of the tests. Ubuntu32 builds and runs tests successfully now, some problems still remain in Ubuntu64 and Windows, as well as a technical problem in Rosbuilder. [Alexander, Vadim, Victor]
Vadim
The last week was spent in fixing compile problems and test failures:
- closed tickets ## 376, 415, 431, 440, 447, 449, 461 + the patch for OSX
- made OpenCV compile with GCC # x
- studied papers on belief-propagation-based stereo correspondence algorithms: by P. Felzenszwalb and the later “constant-space” modification.
Plans:
- do experiments with Hirschmuller algorithms on high-resolution images
- distribute latex documentation by modules
- try BP-based stereo correspondence
Baksheev
Accomplishments:
- Created OpenCV GPU API draft. Began its implementation and integration with OpenCV.
- Investigated differences in results between the OpenCV CPU and GPU implementations of the stereo block matching algorithm on Radu’s dataset with textured light from WillowGarage. The main reasons for the differences are that OpenCV uses Sobel with a threshold as a prefilter and SAD instead of SSD. After doing the same in the GPU implementation, we got very similar results. I attached screenshots that demonstrate the influence of Sobel. I think we should add an option to prefilter with Sobel when we port to the OpenCV GPU API interface. A comparison of the block matching and Hirschmuller approaches on various stereo pairs has been performed. The details will be sent out in a separate message.
- DemoCVPR was ported back to Cuda runtime API (for further integration with OpenCV) and to Linux. Created CMake-files for it. So next we can integrate the code in textured_object_detection module in ROS.
- Read Hirschmuller’s papers and papers about CPU and GPU implementations of his algorithm. Investigated approaches to implementing Hirschmuller’s stereo on GPU. We could parallelize different directions at the grid level and the inner loop over disparity at the thread block level. A first naive implementation might take 3-5 days. After that we will see all the performance bottlenecks and problems, and whether it could be faster on GPU than on CPU. We are also investigating a belief propagation based matching method, which is slow but may parallelize very well on GPU. A drawback of BP is that we don’t have a prototype that works on CPU with good quality.
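The two differences named in the block matching comparison above, a thresholded Sobel prefilter and SAD versus SSD as the matching cost, can be sketched as follows. This is an illustration only, not the OpenCV or GPU code: the function names and the cap parameter are assumptions, and the clamping scheme (clamp the horizontal Sobel response to [-cap, cap], then shift it to [0, 2*cap]) is only loosely modeled on how such a prefilter is usually described.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <vector>

// Horizontal Sobel response at (x, y) of a row-major 8-bit image, clamped to
// [-cap, cap] and shifted into [0, 2*cap]. Strong edges saturate, so a few
// very bright pixels cannot dominate the matching cost.
int cappedSobelX(const std::vector<unsigned char>& img, int width, int x, int y, int cap)
{
    assert(x >= 1 && x + 1 < width && y >= 1 && cap >= 0);
    assert(static_cast<std::size_t>((y + 2) * width) <= img.size());
    auto at = [&](int xx, int yy) { return int(img[yy * width + xx]); };
    const int g = (at(x + 1, y - 1) - at(x - 1, y - 1))
                + 2 * (at(x + 1, y) - at(x - 1, y))
                + (at(x + 1, y + 1) - at(x - 1, y + 1));
    return std::min(std::max(g, -cap), cap) + cap;
}

// Sum of absolute differences between two equally sized windows.
int sad(const std::vector<int>& a, const std::vector<int>& b)
{
    assert(a.size() == b.size());
    int sum = 0;
    for (std::size_t i = 0; i < a.size(); ++i) sum += std::abs(a[i] - b[i]);
    return sum;
}

// Sum of squared differences; large single-pixel errors weigh more than in SAD.
int ssd(const std::vector<int>& a, const std::vector<int>& b)
{
    assert(a.size() == b.size());
    int sum = 0;
    for (std::size_t i = 0; i < a.size(); ++i) sum += (a[i] - b[i]) * (a[i] - b[i]);
    return sum;
}
```

Because SSD squares each difference, the same pair of windows can rank differently under the two costs, which is one plausible source of the disparity differences seen before the implementations were aligned.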
Plans:
- Finish implementation of the API draft. (Some functionality requires writing GPU code for different image formats).
- Integrate stereo block matching on GPU into OpenCV.
Action Items
Gary
- Ask James about how well the 2D bar code is matching.
- (./) Ask Joe about GPU priorities?
- (./) Ask Steve about purchasing a new server
Vadim
- Think about whether there is an easier way to temporarily enter documentation changes
Victor
- (./) Send short written description of the texture feature algorithm and which parts are fast or slow.
- (./) Send quote for additional server
Kurt
- {?} Try VO on New College sequence.
From last time
Gary
- (./) Write Tamir for clarification about devices in November whether the data is open for others to use. (Nov) Better vision camera in conjunction with it? (Yes, possible)
Vadim
Victor
- Get estimates of time for planar homography calibration patterns.
Kurt
- Find out what Suat is doing
Agenda
- Android port
- 3 or N camera stereo
- Documentation status
- Entering temporary changes
- User contrib
- Test system?
- Object recognition
- Test framework, features2D
Minutes
Android
- If all the make files can be put into a separate directory with instructions to build
- CMake is flexible enough to support Android “out of the box”, but maybe they will create Macros to make it easy and we can then support the Android port with CMake
3 camera stereo - We have 3 cameras that we need to calibrate together
- All 3 cameras could see a calibration, but we might want to make it more flexible so that you can calibrate N cameras with only pair-wise calibration
- But it’s not likely to generalize to N cameras
- Stereo calibration
- Each camera gets calibrated for intrinsics
- Then external calibration
- Then internal
- The image planes have to end up all in the same plane, so there isn’t
- Ideas
- Take images from 1 pair of stereo and the other pair of stereo and then take rotation and translation to solve
- Main pair is the one that sees the most views (?)
- Or take 2 pair with a common camera A with B and A with C. Two sets of R and T.
- Then A has to be consistent with B and C
- Fix A camera rotation so that the plane has the image centers of all of cameras
- Final goal is registration so that, let’s say A and B are for depth and C is for color image to be registered
- All images still need to be rectified so that we can get the correspondences for the pixels at different depths
- Epipolar geometry:
- A and B have to be stereo rectified to do stereo quickly (line search)
- Then want to color it with C
- Get pixel in A, know epiline in C, know its depth from A+B, find the point in C that corresponds to that depth
- Parallax will be off for close things
- Since we have 3 cameras in a row: A on the left, B on the right, C close to A but on its right
- To get rectification most easily: A and B set the stereo plane. Then C is set parallel to that plane and a scale has to be set
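The "color it with C" step can be made concrete under a simplifying assumption: after rectification all three cameras share one image plane, one focal length and one baseline direction, with A on the left. In that case the column of a point in C follows directly from its A-B disparity, scaled by the ratio of baselines. The sketch below is that idealized case only; the function names are hypothetical and it ignores the scale correction for C mentioned above.

```cpp
#include <cassert>

// Depth of a point from its disparity between rectified cameras A and B.
// focalPx is the common focal length in pixels; baseline and the returned
// depth are in the same length unit.
double depthFromDisparity(double focalPx, double baseline, double disparity)
{
    assert(focalPx > 0.0 && baseline > 0.0 && disparity > 0.0);
    return focalPx * baseline / disparity;
}

// Column in camera C of the point seen at column xA in camera A. Disparity is
// proportional to baseline, so the shift between A and C is the A-B disparity
// scaled by baselineAC / baselineAB. The row is unchanged after rectification.
double columnInC(double xA, double disparityAB, double baselineAB, double baselineAC)
{
    assert(baselineAB > 0.0 && baselineAC >= 0.0);
    return xA - disparityAB * (baselineAC / baselineAB);
}
```

For example, with C at a quarter of the A-B baseline, a pixel at column 100 in A with disparity 20 lands at column 95 in C. Nearby objects have large disparities, so any residual calibration error shows up most on them, which matches the note that parallax will be off for close things.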
Victor out to the 26th.
Features2D object recognition
- Haven’t built and tried here
- Looks like it runs under ROS
- Testing framework
- Generic descriptor matcher demonstrates use of abstract interface for comparing keypoints. Just uses the match interface such as ferns
- The general tester is under …/tests/cv/src/adetectordescriptor_evaluation.cpp
- Goal
- list detectors and descriptors to be used and test them against a new detector or descriptor
Documentation
- Reorg in progress
- Can check in now
- Wiki page link where users can just type in documentation notes
- We then edit these changes into the Latex
Vadim
The last week most of the time was spent in fixing compile problems and test failures:
- closed tickets ## 99, 119, 334, 375, 377, 434, 435, 438, 448, 452, 45#
- studied the following 2 papers about stereo correspondence estimation using belief propagation:
- Efficient Belief Propagation for Early Vision. P. F. Felzenszwalb and D. Huttenlocher. IJCV Vol. 70, No. 1, October 200#
- Qingxiong Yang, Liang Wang, and Narendra Ahuja, A Constant-Space Belief Propagation Algorithm for Stereo Matching, IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2010.
- A presentation about the algorithms was given to the Itseez group.
- The guys from the NVidia project started implementing the algorithm on GPU (at first sight the BP-based algorithms are much more scalable than the Hirschmuller SGBM algorithm).
- At James’s request, I implemented the VNG bayer→RGB demosaicing algorithm in OpenCV: https://code.ros.org/trac/opencv/ticket/46# The current C version takes ~16ms to de-bayer a 640×480 image (the simple bilinear algorithm takes ~1.5ms on the same image).
Action Items
Gary
Vadim
Victor
Kurt
From last time
Gary
- Ask James about how well the 2D bar code is matching.
- Ask Joe about GPU priorities?
- (./) Ask Steve about purchasing a new server. Just do it.
Vadim
- Think about whether there is an easier way to temporarily enter documentation changes
Victor
- (./) Send short written description of the texture feature algorithm and which parts are fast or slow.
- (./) Send quote for additional server
Kurt
- Try VO on New College sequence.
Agenda
- Documentation
- Catch up
- User additions
- User contrib …
- 3D Model capture
- Feature based object recognition
- OpenCV on Android
- General object recognition plans
Minutes
Documentation
- Need user link so that we can add in documentation
- Rate good tutorials to emphasize?
- Set up a how to guides on wiki
- Samples code should have a documentation page which indexes what is there
- Flann example should be updated to use features2d
- Can put in a help() function in each sample that is also its documentation
- Extra person on documentation would be good …
Bundle Adjustment
- 4 papers at ECCV mostly on speedups
Object recognition
- Working on dual contour fragments as descriptors Schmidt’s technique
- Can’t tell the sensitivity of MOPED from the literature
- Question is whether FLANN hurts performance or not
- Much faster
- Didn’t see much drop in recognition accuracy. Maybe 5% drop at worst
- Need to compare brute force
- Each training image against the scene
- Then run RANSAC to see if we can find that object
- Not so simple because of how to choose inlier outlier
- But this is for the geometric check, and typically it is not too hard to find a cutoff
- In previous efforts, choosing the best cluster, was nearly 100% recognition
- Look at wrong detections to see if filter out with geometric check
- One would think the strong geometric check would eliminate false positives
MOPED takes a large (50-90) number of nearest neighbors
- Will partition among the objects
- Then perform ratio test
- Then a particular object can get at most one hit from each NN
- They are handling “match starvation” in this way
- In Lowe’s paper, the ratio test is used to keep down the false positive matches of features
- Kurt turned up K in kNN. But when K went beyond 10, the algorithm broke. Probably because no ratio test was used
- Current algorithm is sensitive to number of features and matches. When K goes from 1 to # We get better matches, but more false positives because we are not using the ratio test yet.
- Make sure in ratio test, that you don’t allow spatially overlapping descriptor areas to filter out matches. This will destroy good matches for no reason (they physically overlap).
- There is a ROS function that allows iterating through bag file images
- Green tea was giving 100% using largest cluster method (one object per image found)
- But even with geo-filtering, we got a false positive. This shouldn’t happen. Matching to shadow?
- Clustering: How does it work?
- Turn on showing the clustering, get lots of circles. What does this mean? Distance was defaulted to 100
- Joined keypoints that give same object view
- When kNN count went up, it slows down. Cluster size > 50, so it goes back to closer to brute force
- When k low, cluster size is low and so less checking
- Clustering is done hierarchically using greedy agglomeration.
- Moped clusters, does a weak ransac inlier check, split accordingly, and then they do careful ransac
- but single stage clustering should work when there is no occlusion
- Moped was trying to solve the problem of several copies of the object in the same image
- Notion: Match starvation may be a problem.
- Try Gary’s suggestion of no overlap might help
- We use 10 viewpoints per object. Moped merges to a single object. It’s like we have 10x the number of objects because of this
- MOPED uses fast, gets lots of features, but they use higher resolution images
3 camera stereo rectification (3 cameras inline)
- Vadim needs data, thinks he can make this work
- Calibrate the intrinsic parameters
- Extrinsics Make them orthogonal to baseline. Same as regular stereo
- Get a new image plane for the 2 cameras
- Rotate the third camera into this plane
- Works if optical centers are on same line
- OK if it’s off a little as long as the 3rd image plane is rotated to be parallel
- Then you’ll get a scale offset that will need to be adjusted since the plane will be ahead or behind the other 2 planes
Monocular VO
- Testing on office sequence
- Other visual slam algorithms filter points out of the point cloud if the points stop being consistent
- Need to understand why so many points are getting in in the new college sequence
Android
- Took a brief look
- Will do some changes in OpenCV to fix some issues that will make building on Android easier
- This will enable simpler Android
3D model capture
- Trying to scale up BiGG
- Needed for scaling, don’t know where the resources may be
- Just a “head’s up”
Data collection
- Will go massive hopefully with pan-tilt table
Latent SVM
- Can load model and do a prediction using c version of his code
- 5 second per image
- original Felzenszwalb was about 2 seconds
- Marius converted Felzenszwalb’s cascade from Matlab to Octave
- Doing per image.
- Cascade algorithm could be parallelized
- His technique is fairly general and doesn’t have to depend on HOG. We might try using BiGG
Server
- Can expense a new server dedicated to object recognition.
Plans
- Gary on vacation Aug 14-23
- ECCV Sept 5-11 (Gary)
- Graph Con Sept 20 (Victor)
- IROS Oct 18-22 (Gary)
- Maybe, end of Sept, Victor comes out
- Texture descriptor work right now
Anatoly
Accomplishments:
- OpenCV GPU module
- Updated OpenCV GPU API according to Joe’s comments.
- Finished the API implementation. Now we can focus on algorithms and optimizations.
- Added 3 tests cases for opencv_gpu module (GpuMat::convertTo, GpuMat::copyTo, async calls tests)
- StereoBM_GPU:
- Added low textureness based disparity filtering.
- Fixed StereoBM kernel crash that was found during textureness based filtering integration in OpenCV.
- Implemented first version of BP on GPU. It provides a 20-25 times speedup on Fermi compared with Felzenszwalb’s CPU implementation. It can run only on 640×480 now due to high memory usage and shows # 4fps at that resolution. The main bottleneck is huge memory transfers. We have several ideas for optimizations.
Plans:
- Continue work with BP GPU implementation: performance optimizations, trying to minimize memory usage in order to run on high resolutions.
- Compare quality of the algorithm with other stereo matching algorithm implementations.
- Probably investigate and implement “A Constant-Space Belief Propagation Algorithm for Stereo Matching” (http://vision.ai.uiuc.edu/~qyang6/)
This is a variant of BP algorithm that works with sparse belief vector, therefore demands less memory.
Victor
OpenCV:
- Several test failures still show up for Ubuntu32 with gcc# 4 as well as for Windows mingw.
- Python compilations and tests problems persist for mingw too.
- Rosbuilder (compilation of all ros packages depending on opencv, cturtle distro is used for now as latest is broken) is finally up and compiling successfully as well as various flavours of Ubuntu and gcc. Compilation issues (trac #490, #495) have been resolved. [Alexander]
Object recognition:
- 6 new objects have been added to the benchmark dataset
- (tilex, milk, cascade, numi, pledge, spam).
- Code refactoring of textured_object_detection package.
- Statistics of experiments with object recognition for SIFT/SURF, ratio test and varying other parameters is attached.
- So far ratio test was tested with thresholds 0.6 and 0.8 and we see a drop in detection rate accompanied by a drop in false positive rate. The full table is attached. [Alexander]
Note: there is a difference between the accuracies reported in my previous report and in several of Alexander’s subsequent letters. This is caused by a difference in accuracy definitions.
- In the former case we tried to classify a test image into several classes and the class that got the most inliers won.
- For the latter case we solved a detection problem of finding all object instances in a test image and so any cluster with a sufficiently high number of inliers was recognized as an object, which led to lower accuracy numbers.
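The ratio test with thresholds 0.6 and 0.8 mentioned above can be illustrated with a minimal pure-Python sketch. It assumes each query descriptor already has the distances to its two nearest training descriptors; the function name `ratio_test` and the tuple layout are hypothetical and are not taken from the textured_object_detection package.

```python
def ratio_test(knn_matches, threshold=0.8):
    """Keep a match only if its best distance is clearly smaller than the second best.

    knn_matches: list of (query_idx, train_idx, best_dist, second_dist) tuples
                 (hypothetical layout chosen for this sketch).
    """
    kept = []
    for query_idx, train_idx, best, second in knn_matches:
        # A small best/second ratio means the match is distinctive.
        if second > 0 and best < threshold * second:
            kept.append((query_idx, train_idx))
    return kept


matches = [
    (0, 5, 0.20, 0.50),  # ratio 0.40: kept at both thresholds
    (1, 7, 0.35, 0.50),  # ratio 0.70: kept at 0.8, rejected at 0.6
    (2, 3, 0.45, 0.50),  # ratio 0.90: rejected at both thresholds
]
loose = ratio_test(matches, threshold=0.8)
strict = ratio_test(matches, threshold=0.6)
```

A stricter threshold discards more matches, which is consistent with the reported drop in both detection rate and false positive rate.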
Features_2d:
- calonder descriptor integrated, algorithmic and timing tests comparing ROS and OpenCV versions of calonder_descriptor have been implemented.
- Other changes coming from Patrick’s request (windowedMatchingMask, index(), matchImpl()). Maria
Other: Victor was on vacation July 20-26
Vadim
Progress:
- optimized VNG-based image demosaicing using SSE
- The running time on 640×480 image decreased from ~16ms down to ~6ms – about 3x performance increase.
color conversion
- rewrote all the color conversion functions in C++, added some new functionality:
- new color conversion codes RGB2HSV_FULL, RGB2HLS_FULL (where h varies from 0 to 255, not from 0 to 180) and the reverse ones.
- RGB2Lab, RGB2Luv now treat the input RGB as the most popular sRGB space and apply the necessary gamma correction (the problem was reported long ago by G. Kloss).
- The original conversion is still available via the LRGB2Lab and LRGB2Luv codes. Despite the extra gamma correction, the performance of RGB→Lab conversion has been greatly improved.
- RGB2YUV conversion added.
- The internals of the new color conversion engine are now more flexible, so cvtColor can be extended in the future to handle various RGB profiles (like sRGB, Adobe RGB etc.)
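To make the two changes above concrete, here is a small standard-library sketch of (a) the two 8-bit hue ranges, 0..180 versus the full 0..255, and (b) the standard sRGB gamma expansion that a conversion to Lab or Luv needs before the linear math. The helper names `hue_8bit` and `srgb_to_linear` are hypothetical, and the rounding is an assumption; this is not the cvtColor implementation.

```python
import colorsys


def srgb_to_linear(c):
    """Undo the sRGB gamma for one channel value c in [0, 1] (standard sRGB formula)."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def hue_8bit(r, g, b, full_range=False):
    """Return the hue of an 8-bit RGB pixel scaled to 0..180 or to 0..255."""
    h, _, _ = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)  # h in [0, 1)
    scale = 255.0 if full_range else 180.0
    return int(round(h * scale))  # rounding mode is an assumption of this sketch


blue_half = hue_8bit(0, 0, 255)                    # hue 240 degrees on the 0..180 scale
blue_full = hue_8bit(0, 0, 255, full_range=True)   # same hue on the 0..255 scale
mid_gray_linear = srgb_to_linear(0.5)
```

The full-range variant uses the whole byte, so it keeps slightly more hue resolution than the 0..180 encoding.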
Interpolation
- A useful cubic spline interpolation function has been added for very fast approximation of various functions (like gamma correction)
- all the tests pass.
Bugs
- The tickets related to color conversion, #328 and #450, have been closed.
New inline stereo
- studied 2 papers on trinocular stereo rectification:
- “An Efficient Trinocular Rectification Method for Stereo Vision” by Y. K. Baik, J. Choi and K. M. Lee. http://cv.snu.ac.kr/newhome/publication/pdf/conf/ic050.pdf
- “Trinocular Rectification for Various Camera Setups” by M. Heinrichs and V. Rodehorst.
http://srv-43-200.bv.tu-berlin.de/publications/pdf/P_0# pdf
- Neither of the algorithms will work on our configuration, since ours is “collinear stereo” where all the camera optical centers are on the same line (if they have the same lenses).
- On the other hand, it appears that the existing cvStereoRectify can be modified to handle such a case.
- The current algorithm first makes the optical axes of both cameras parallel, and then rotates both cameras by the same angle to make the optical axes orthogonal to the baseline.
- The first part can be extended to bring the third camera axis parallel to the first two.
The prototype implementation is ready. It would be very useful to have some data to test the algorithm before putting it to SVN.
Action Items
Gary
- Will collect data
- Try to collect with pan-tilt table
Vadim
Victor
Kurt
- Get higher res data to Victor … this time for sure!
- Get Vadim 3 inline camera data
From last time
Gary
- Ask James about how well the 2D bar code is matching.
- (./) Ask Joe about GPU priorities?
- (./) Ask Steve about purchasing a new server
Vadim
- Think about whether there is an easier way to temporarily enter documentation changes
Victor
- (./) Send short written description of the texture feature algorithm and which parts are fast or slow.
- (./) Send quote for additional server
Kurt
- {?} Try VO on New College sequence.
Agenda
- Spherical and cylindrical projections
- Features2D
- Object recognition
- OpenCV on Android
Minutes
- Spherical first pass is done
- Cylindrical: y is not modified.
- What is the final purpose of this function?
- Pose estimation
- 3D reconstruction
- Display
- Feature based object detection and tracking
- — it’s for display here
- With 3 cameras, use middle one’s
- Working on general pictorial 2D barcodes Gary
- Descriptor matcher is a good reference example for features 2D
- Projection stuff in its own pipeline?
- Calib3D
- Image stitching
- Projection and Stitching
- Use keywords
- Question remains of where to put this
- Features2D depends on image processing
- Dependency graph. Uses Features2D and image processing
- Android port, look at this week
- GSoC, Vadim’s people
- Qt bindings for HighGUI
- Save currently displayed image
- Zoom, scroll, measure pixel values
- Flexible controls
- is partially in HighGUI already
- Trackbar, address of variable
Vacations
- Gary on vacation Aug 14-23
- Vadim out Aug 9-23
- Victor out to Aug 19
- ECCV Sept 5-11 (Gary)
- Graph Con Sept 20 (Victor)
- IROS Oct 18-22 (Gary)
- Victor on vacation
- Vadim Monday
Vadim
Here is the progress made last week:
- 8 more trac tickets have been closed: ##295, 296, 310, 321, 348, 396, 408, 501
- By request from WG (Kurt, Gary) implemented a draft version of spherical projection function (when the image from a calibrated camera is projected to a sphere rather than a plane). See below
Vadim: start of spherical projection
We have a point (u, v) on an image (let’s assume it is already undistorted).
From the camera matrix we compute
(x, y, 1) = ((u - cx)/fx, (v - cy)/fy, 1)
We find the intersection of the ray (X, Y, Z) = k*(x, y, 1) with the sphere
X**2 + Y**2 + (Z - alpha)**2 = 1
(alpha is the distance between the center of the sphere and the camera center, assuming that the radius is 1), i.e. we get
(kx)**2 + (ky)**2 + (k - alpha)**2 = 1.
From that we find k, then we project the intersection point from the sphere back to the plane using orthogonal projection:
http://www.uwgb.edu/dutchs/structge/sphproj.htm
Stereographic projection, as well as cylindrical projection, is not difficult to implement either; I just wanted to make sure first that this is what you need.
I.e. the forward mapping (u, v) → (kx, ky) is easy to compute analytically. The inverse mapping (needed for cv::remap()) cannot be expressed analytically (I think), but the Newton-Raphson method appears to compute it in 2-3 iterations on average.
- Figure 1: Spherical projection with alpha = 0
- Figure 2: Spherical projection with alpha = 0.5
- Figure 3: Spherical projection with alpha = 1
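The forward mapping described above reduces to a quadratic in k: with a = x**2 + y**2 + 1, the sphere equation becomes a*k**2 - 2*alpha*k + (alpha**2 - 1) = 0. The sketch below solves it and returns (kx, ky). The function name `spherical_forward` is hypothetical, and taking the larger root (the intersection in front of the camera) is an assumption of this sketch, not something stated in the report.

```python
import math


def spherical_forward(u, v, fx, fy, cx, cy, alpha=0.0):
    """Map an undistorted pixel (u, v) to (k*x, k*y) on the unit sphere, plus k itself."""
    x = (u - cx) / fx
    y = (v - cy) / fy
    a = x * x + y * y + 1.0
    # Discriminant of a*k^2 - 2*alpha*k + (alpha^2 - 1) = 0, divided by 4.
    disc = alpha * alpha - a * (alpha * alpha - 1.0)
    if disc < 0.0:
        raise ValueError("the ray does not intersect the sphere")
    k = (alpha + math.sqrt(disc)) / a  # assumed choice: the larger root
    return k * x, k * y, k


# Principal point with alpha = 0: the ray hits the sphere at distance 1.
px, py, k0 = spherical_forward(320.0, 240.0, 500.0, 500.0, 320.0, 240.0, alpha=0.0)

# An off-center pixel with alpha = 0.5: the result must satisfy the sphere equation.
qx, qy, k1 = spherical_forward(420.0, 300.0, 500.0, 500.0, 320.0, 240.0, alpha=0.5)
residual = qx * qx + qy * qy + (k1 - 0.5) ** 2 - 1.0
```

The orthogonal projection simply drops the Z coordinate, which is why only (kx, ky) is needed for the output image.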
Comments on This
From the reference you give, translating to image-based terms, I think:
- Gnomonic projection → planar projection
- Equiangular projection → equirectangular panoramic image
At any rate the other projections are interesting. I kind of like the
orthographic projection, so let’s keep that to work with. But I think
the standard projection is the equiangular one (for panoramic imagery)
See these references:
- http://mathworld.wolfram.com/CylindricalProjection.html
- http://mathworld.wolfram.com/EquirectangularProjection.html
Typical panoramic viewers expect cylindrical or equirectangular images.
I especially recommend this small tutorial:
- http://www.cambridgeincolour.com/tutorials/image-projections.htm
Cylindrical projections look good for modest vertical FOV (e.g., 45
deg); otherwise the spherical projection is probably best.
- Note that Mercator is a compromise between cylindrical and spherical.
- The planar projection looks odd past 120 deg or so.
- Note also that they stitch together three photos with approximately the same FOV as we
have with the 3-cam setup, and show a cylindrical projection which looks nice.
Ok, action item: generate equirectangular warp with variable viewpoint
(from the center of the sphere to close to the surface).
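As a point of reference for the equirectangular action item, here is a minimal sketch of the textbook mapping for the simplest case only: the viewpoint at the center of the sphere. It takes the normalized ray (x, y, 1) and returns a position in a full 360 x 180 degree panorama. The function name `equirect_pixel` and the panorama layout are assumptions; the variable-viewpoint version requested above is not covered.

```python
import math


def equirect_pixel(x, y, width, height):
    """Map a normalized camera ray (x, y, 1) to equirectangular image coordinates.

    Assumes the viewpoint is at the sphere center and the panorama spans
    360 degrees horizontally and 180 degrees vertically.
    """
    lon = math.atan2(x, 1.0)                  # longitude in (-pi/2, pi/2) for a forward ray
    lat = math.atan2(y, math.hypot(x, 1.0))   # latitude in (-pi/2, pi/2)
    col = (lon / (2.0 * math.pi) + 0.5) * width
    row = (lat / math.pi + 0.5) * height
    return col, row


center = equirect_pixel(0.0, 0.0, 1000, 500)      # optical axis lands in the image center
right_45 = equirect_pixel(1.0, 0.0, 1000, 500)    # 45 degrees to the right
```

Because column and row are linear in the two angles, equal angular steps give equal pixel steps, which is what panoramic viewers expect from this format.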
Action Items
Gary
- 2D Bar codes
Vadim
- Look over android-opencv
- generate equirectangular warp with variable viewpoint (from the center of the sphere to close to the surface).
Victor
Kurt
- Get back to Vadim on spherical projection effort.
From last time
Gary
- Will collect data
- Try to collect with pan-tilt table
Vadim
Victor
Kurt
- Get higher res data to Victor … this time for sure!
- (./) Get Vadim 3 inline camera data
Agenda
- Documentation
- Spherical stitching
- TOD — Textured object detection
- 3 camera stereo
Minutes
- Features2D:
- OpenCV transition done
- Ratio test in a separate matching method
- FAST detector a bit different in OpenCV
- Maria to check, Helen to send info
- Object Detection:
- FLANN seems to work differently, reversion of performance in TOD
- Migration of ROS FLANN to OpenCV? Check with Marius
- TOD baseline: single-object matching [Alexander]
- Other detectors
- PAS detector (Schmid et al.) and Oriented chamfer matching (Shotton)
- Scaling linear in oriented chamfer matching
- GSoC (Chatfield) – color categories
- 3-cam stereo
- Working on algorithm [Vadim]
- Disparity used to get pixel transform
- Target date: end of this week
- Spherical calibration
- Do some work looking at papers and designing an algorithm
- Send papers to Vadim [Kurt]
- Issues:
- Finishing off projects and app pipelines in OpenCV
- good work on Features2D, should be a template for other apps
- FLANN reversion in TOD
- documentation
- Camera coordinate system in OpenCV / Compare to Hartley/Zisserman and OpenGL
Action Items
Gary
- Talk to Marius about migrating ROS FLANN to OpenCV FLANN
Vadim
- Review techniques for spherical calibration
Victor
- TOD baseline: single-object matching [Alexander]
Kurt
- Ping Pascal Fua about megapixel stereo
- Send spherical calibration papers to Vadim
From last time
Gary
Vadim
Victor
Kurt
- Get higher res data to Victor … this time for sure!
- (./) Get Vadim 3 inline camera data
Agenda
- Context:
- Overview of recent focus: Distribution centers
- Critical importance of object recognition
- Object recognition:
- Victor’s arrival
- recognition_pipeline
- object_recognition
- textured_object_detection (TOD)
- Features2D status
- object_recognition_experimental
- BiGGPy
- binary_pairs
- Other features, detectors
- Status of PAS (Schmid), Oriented chamfer (Shotton), Color Categories (Chatfield)
- Calibration
- Spherical calibration progress report
- Wide angle calibration
- 3-cam stereo
- GSoC Aftermath
- Status/plans to integrate
- New HighGUI
- Android port
- Panoramic stitching
- Detectors
- Others?
- Documentation:
- Need to make a plan to absorb and then document.
- Foundation and/or non-profit status for OpenCV
Minutes
- Described distribution job
- Object recognition
- TOD
- Dataset with images of different resolutions
- Add more objects
- Add objects from Canon dataset
- Compared resolution results. Recognition for HD is better
- Take calibrated stereo pair, record scan of rotated object (written by Romain and supported by Radu, now this package is gone).
- PAS, Oriented Chamfer, Colored Features (and wrappers to PASCAL VOC. Cascaded HoG not done … trying to finish)
- Felzenszwalb is in the pipeline
- Calibration
- 3 camera is ready, being debugged right now. 3rd camera scale is too magnified. Debugging ~80%
- On wide angle, try (x,y) distortion model, don’t use polynomial since wide angle has too large of error there and polynomials don’t handle this well
- Want a fractional (rational) function to represent the distortion; Poly/Poly might work better
- Might need different calibration — big chessboard with lots of corners.
- Might make sense to make a partial view chessboard detector
- GSoC aftermath
- HighGUI QT is now in trunk :-)
- VOC is important, to make it nice to allow people to run on PASCAL very easily
- Color features (SIFT features over 3 channels and combine into higher dimension descriptor)
- Android port, now done with cmake
- Image stitching
- Documentation
- Getting behind
- Next OpenCV release in October. Need to find a code stop point.
- Victor Sept 22nd→NVidia GTC. Willow Sept 23rd. to Oct. 14th
Victor
Object detection: a problem with low accuracy under cturtle was fixed by adjusting the ratio threshold (the problem was caused by flann returning wrong L1 distances in the earlier versions of ROS). All unnecessary dependencies, namespaces and headers were removed from the textured_object_detection package. A tool for visualizing inliers for a pair of train and test samples has been implemented. solvePnPRansac has been parallelized using TBB. The speedup on 4 HT cores Intel Core i7-960 # 2GHz, 12Gb DDR3 is # 5×.
The current training base (640×480 images shot by Radu) has been enlarged to 18 objects and merged with the canon dataset (9 objects, 1280×853, taken with a couple of 40D cameras), resulting in a dataset of 27 objects in the training set and 22 in the test set (there were few test images in the canon dataset, so we had to take new ones, and 5 out of 9 objects will be available only tomorrow). The recognition results for the new and old bases are summarized in the attached pdf file. [Alexander]
A proposal for adjusting the features2d interface has been developed. The new interface incorporates all older functionality together with matching against several images, the ratio test and the cross-check test. The proposed interface will be sent out after team review on Wed. [Maria]
Mono vslam: A problem with lots of points appearing behind the camera along its trajectory has been solved: the goodPts flag was not taken into account by the voSt engine. The current problem is loss of tracking in a certain place of new_college. Changing the ransac PnP parameters did not solve the problem, nor did filtering out matches with large distances. Repeatability of FAST on a sample image from the new_college sequence (measured using a sample from OpenCV) is 0.98, so most of the points are tracked across the video. However, the precision-recall curve for SURF for the same image shows much worse results: for precision 0.1 recall is equal to 0.0# This means that if we choose 4% of matches with minimum distances, only 10% among these matches will be correct. Of course these are lower numbers than what we have in posest because of windowed match. But still, the lack of correct matches is the main hypothesis for the observed loss of tracking. [Victor]
Figure # Current TOD Results, 640×480
Vadim
- 3-camera rectification algorithm has been written and now is being debugged.
- Papers on wide-angle camera calibration have been reviewed. Currently it is planned to reuse the existing calibration engine to calibrate such cameras, but the lens distortion will be represented as a rational function of (x,y) instead of a polynomial.
- Several functions from CLapack, embedded into OpenCV distribution, have been tweaked to get better SVD performance. As a result, large matrix decomposition is now ~20% faster, and the small (2×2 – 4×4) matrix decomposition is up to 100% faster.
The plans:
- finish with 3-camera rectification
- implement rational lens distortion model: undistortion + calibration functions
- check Android port of OpenCV
Anatoly Baksheev GPU
- OpenCV GPU module.
- Implemented function for BGR2GRAY, RGB2GRAY, GRAY2BGR conversions.
- Fixed bug in BP with user allocated disparity
- Implemented async versions of drawColorDisp, reprojectPointsTo3D functions.
- Implemented remap with bilinear interpolation for color images. It does not use texture references, because uchar3 type is not supported for the references.
- Stereo On GPU demo.
- Added point cloud visualization.
- Added possibility to run BP, CSBP on color images (because now we can rectify color images too)
- Found several crashes:
- Connected with context managing.
- Probably connected with host code that supports ‘double’ and device code for Compute capability 1.1 that does not support it. OpenCV is compiled for all GPU architectures now.
- A crash with an unknown reason.
We spent several days trying to find the reason, but still without success.
- We looked into NPP. We think the best way to integrate OpenCV GPU with NPP is to create OpenCV-style wrapper functions in the module that will call NPP. The wrappers won’t support 3-channel images because NPP does not, but in the future we may implement it. NPP binaries for all platforms are too heavy, so we are not going to include them in OpenCV SVN. In order to compile OpenCV with NPP support, the user will have to install NPP separately and specify the installation directory in CMake. If OpenCV is compiled without NPP, the wrappers will throw an exception. A complete list of functions to implement such wrappers will be created later.
Plans:
- Finish updating DemoCVPR (fix the crashes)
- Integrate OpenCV with NPP.
- Finish studying GraphCut based methods. Decide if we are implementing them.
Action Items
Gary
- Get new scanning from Radu
- Draft OpenCV release include list
- Send Victor foils
- Make sure Victor has a place to stay on visit.
Vadim
- Draft OpenCV release include list
Victor
- Draft OpenCV release include list
Kurt
From last time
Gary
- (./) Talk to Marius about migrating ROS FLANN to OpenCV FLANN
- {*} On Marius’s “todo” list to migrate latest FLANN to OpenCV
Vadim
- Review techniques for spherical calibration
Victor
- TOD baseline: single-object matching [Alexander]
Kurt
- {o} Ping Pascal Fua about megapixel stereo
- {o} Send spherical calibration papers to Vadim
Agenda
- Gary and Kurt are in Crete, just reports today
- See Release # 2 plans
Minutes
Vadim
The last week was spent on debugging the 3-camera rectification algorithm and on implementing the extended
distortion model that handles wide-angle cameras. Both algorithms have been finished and committed to SVN:
- the new function
float rectify3( const Mat& cameraMatrix1, const Mat& distCoeffs1,
const Mat& cameraMatrix2, const Mat& distCoeffs2,
const Mat& cameraMatrix3, const Mat& distCoeffs3,
const vector<vector<Point2f> >& imgpt1,
const vector<vector<Point2f> >& imgpt3,
Size imageSize, const Mat& R12, const Mat& T12, const Mat& R13, const Mat& T13,
Mat& R1, Mat& R2, Mat& R3, Mat& P1, Mat& P2, Mat& P3, Mat& Q,
double alpha, Size newImgSize,
Rect* roi1, Rect* roi2, int flags );
has been added. It takes 3 camera matrices and the corresponding distortion coefficients, as well as
the (R[otation], T[ranslation]) pairs from camera 1 to camera 2 and from camera 1 to the camera #
Here, camera 1 is the left-most camera, camera 2 is the right-most camera and camera 3 is the camera in
the middle. The algorithm computes the rectification transformations for each of the 3 cameras, as well as
the new projection matrices.
The function returns the disparity ratio (0 < ratio < 1), i.e. the disparity value d, computed from the
camera 1 and camera 2, should be multiplied by the ratio to get the corresponding pixel in the camera 3 view:
img1(x,y) ~ img2(x-d,y) ~ img3(x-d*ratio,y).
optionally, the function takes the chessboard corners projections for the cameras 1 and 3 – imgpt1 and imgpt3,
and then adjusts R3 and P3 to minimize the reprojection error.
The sample image is attached.
- the distortion model has been extended from
xd = x*(1 + k1*r^2 + k2*r^4 + k3*r^6) +
yd = y*(1 + k1*r^2 + k2*r^4 + k3*r^6) +
to
xd = x*(1 + k1*r^2 + k2*r^4 + k3*r^6)/(1 + k4*r^2 + k5*r^4 + k6*r^6) +
yd = y*(1 + k1*r^2 + k2*r^4 + k3*r^6)/(1 + k4*r^2 + k5*r^4 + k6*r^6) +
the functions calibrateCamera, stereoCalibrate, projectPoints, undistortPoints, initUndistortRectifyMap
and others have been modified accordingly.
The new model appears to handle wide-angle cameras much better than the previous model.
The sample images are attached. As you can see, the rational distortion model results in straight lines in the
undistorted images, even though the usual pinhole camera model is used, not the special fisheye lens model.
Besides, the average reprojection error dropped from 1.79 pixels to 0.47 pixels on the tested dataset.
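The radial part of the extended model can be sketched in a few lines of plain Python. This covers only the rational factor written out above, with r^2 = x^2 + y^2; whatever terms follow the trailing "+" in the equations are deliberately left out. The function names are hypothetical, and this is not the OpenCV implementation.

```python
def rational_radial_factor(x, y, k):
    """Return (1 + k1*r^2 + k2*r^4 + k3*r^6) / (1 + k4*r^2 + k5*r^4 + k6*r^6)."""
    k1, k2, k3, k4, k5, k6 = k
    r2 = x * x + y * y
    numerator = 1.0 + r2 * (k1 + r2 * (k2 + r2 * k3))      # Horner form of the polynomial
    denominator = 1.0 + r2 * (k4 + r2 * (k5 + r2 * k6))
    return numerator / denominator


def distort_radial(x, y, k):
    """Apply only the radial rational factor to a normalized point (x, y)."""
    s = rational_radial_factor(x, y, k)
    return x * s, y * s


# With k4 = k5 = k6 = 0 the model falls back to the old polynomial one.
old_model = distort_radial(1.0, 0.0, (0.1, 0.0, 0.0, 0.0, 0.0, 0.0))
new_model = distort_radial(1.0, 0.0, (0.1, 0.0, 0.0, 0.2, 0.0, 0.0))
```

Setting the three denominator coefficients to zero reproduces the previous model exactly, so existing calibrations remain a special case of the new one.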
Plans:
- try & integrate Android port of OpenCV
- continue to clean the bug tracker
- start revising the documentation
Victor
Object recognition: the algorithm for automatic creation of a training base from a bag file or a set of stereo pairs has been created. We follow the ideas from [1], iteratively adding samples that are recognized poorly to the training base. The test base also had to be re-created in order to eliminate overlap between the training and test sets. Experimental results are on the way. [Alex]
Features2d: we did several iterations on the interface for matching an image vs several images. The main issue is how to convert a global descriptor index (descriptors from different images have to be stored together in a single matrix for search with flann) to image and keypoint indices. The implementation is in progress together with usage samples, which will be reviewed by the team this week. [Maria]
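One common way to do the index conversion described above is to keep the cumulative descriptor counts per image and binary-search them. The sketch below shows that idea with the standard library only; the names `build_offsets` and `global_to_local` are hypothetical, and this is not the features2d interface under review.

```python
from bisect import bisect_right
from itertools import accumulate


def build_offsets(descriptor_counts):
    """Return the start row of each image in the merged descriptor matrix, plus the total."""
    return [0] + list(accumulate(descriptor_counts))


def global_to_local(global_idx, offsets):
    """Convert a row of the merged matrix to (image index, keypoint index in that image)."""
    if not 0 <= global_idx < offsets[-1]:
        raise IndexError("global descriptor index out of range")
    img_idx = bisect_right(offsets, global_idx) - 1
    return img_idx, global_idx - offsets[img_idx]


# Three training images with 3, 0 and 2 descriptors respectively.
offsets = build_offsets([3, 0, 2])
first = global_to_local(0, offsets)
last_of_first = global_to_local(2, offsets)
first_of_third = global_to_local(3, offsets)
```

Using `bisect_right` means an image with no descriptors is skipped automatically, and each lookup costs O(log n) in the number of images.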
Mono visual slam: several bugs have been fixed, and a full circle of new_college is tracked now. The main issue is the accumulation of camera trajectory errors: instead of a closed loop we get a spiral. The test system has been modified to allow a more general camera trajectory, and the new_college scene has been simulated as a flat ring of points and a camera circling above them. However, the problem observed on real data is not reproduced on simulated data (see the image below). [Victor]
Shape recognition: a clustering algorithm for creating an object taxonomy from a confusion matrix has been created. We follow the idea of [2], iteratively merging object classes that are most often confused with each other. An example of a taxonomy is shown here. This will become essential as we scale up the number of objects. This is observed, for example, in [3], where they deal with 10K object categories. [Ilya]
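The merging idea can be illustrated with a small greedy sketch: repeatedly join the two groups whose members are most often confused with each other, and record the joins as nested tuples. This is only an illustration of the general approach, assuming average symmetric confusion as the merge score; it is not the algorithm from the cited paper or the team's implementation, and the class names are made up.

```python
def build_taxonomy(confusion, labels):
    """Greedily merge the most-confused groups; return the taxonomy as nested tuples.

    confusion[i][j] is the number of samples of class i that were labelled as class j.
    """
    trees = list(labels)                         # current group representations
    members = [[i] for i in range(len(labels))]  # class indices inside each group
    while len(trees) > 1:
        best = None
        for a in range(len(trees)):
            for b in range(a + 1, len(trees)):
                total = sum(confusion[i][j] + confusion[j][i]
                            for i in members[a] for j in members[b])
                # Average over class pairs so large groups are not favoured (assumed score).
                score = total / (len(members[a]) * len(members[b]))
                if best is None or score > best[0]:
                    best = (score, a, b)
        _, a, b = best
        trees[a] = (trees[a], trees[b])
        members[a] = members[a] + members[b]
        del trees[b]
        del members[b]
    return trees[0]


labels = ["mug", "cup", "stapler"]
confusion = [
    [8, 2, 0],  # mugs are sometimes labelled as cups
    [3, 7, 0],  # cups are sometimes labelled as mugs
    [0, 1, 9],  # staplers are almost never confused
]
taxonomy = build_taxonomy(confusion, labels)
```

In this toy example the two frequently confused classes end up under one node, and the distinct class joins only at the root.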
Other: Alexander was on sick leave for most of the week.
Best Regards, Victor
[1] Shape matching and object recognition using shape contexts, Belongie, S. and Malik, J. and Puzicha, J., IEEE Transactions on Pattern Analysis and Machine Intelligence, 2002
[2] Learning and Using Taxonomies For Fast Visual Categorization, Gregory Griffin and Pietro Perona, CVPR 2008
[3] What does classifying more than 10,000 image categories tell us?, Jia Deng, Alex Berg, Kai Li and Li Fei-Fei, ECCV 2010
CUDA Anatoly Baksheev
Accomplishments:
- We spent most of the time trying to find the reason or a workaround for the crash described before (or stereo kernel launch failure with an unknown error). We worked in the following way: we took the whole of OpenCV and began excluding code from it step by step while the crash was still reproduced. If the crash disappeared, we reverted the change and continued by excluding other code. As a result we have 2 small samples that demonstrate the crash. We submitted this bug on the NVidia extranet (partners.nvidia.com).
- OpenCV GPU module.
- meanShiftFiltering was updated to return output in RGBA format. We started updating the module in order to support color in RGBA by all algorithms.
- Done small refactoring in the module and corresponding tests.
Plans:
- Integrate OpenCV with NPP.
- Finish updating Stereo Demo
Action Items
Gary
Vadim
Victor
Kurt
From last time
Gary
- Get new scanning from Radu
- (./) Draft OpenCV release include list
- Send Victor foils
- Make sure Victor has a place to stay on visit.
Vadim
- (./) Draft OpenCV release include list
Victor
- (./) Draft OpenCV release include list
Kurt
Agenda
- We missed 2 weeks due to vacations and conferences, see Prior Reports below
- Finalize what is in the release
Working ideas list:
Key:(./) = Done; :\ = Ongoing, probably make it; :( = Behind; {X} = Cut from list/will not make it.
- (./) Improved camera calibration (wide-angle lens support …)
- This is now the default; flags can be used to fix the new parameters of the rational model
- :\ Reorganized documentation (that reflects the new OpenCV structure)
- (./) Google test support, sample tests (so that users can add new tests together with the new functionality)
- :\ Python support extended to the new C++ functionality & MLL
- Need to look at pyopencv
- This might be easiest
- (./) Android support in the build system, (and optionally iphone)
- :\ More bugfixes (reported in the bug tracker).
- (./) Finish features2d interface for one to many image matching (in progress by Maria)
- :\ Include interface to PASCAL Visual Object Challenge and bag of words matching done by Ken Chatfield
- :\ Include latent SVM detector done by Nizhny U (in progress, this depends on U folks, they are aware of tight schedule, (./) agreed to ratify external API with us by Sep 20)
- (./) Add algorithmic/timing tests and documentation on features2d
- Part of this awaits ratification (match 1 image against multiple images)
- (./) Replace inclusions of old OpenCV headers by new ones (obtained after OpenCV restructuring) throughout the library to optimize building time.
- Some tests still include old style headers … plan to fix by release
- Release date is mid to end of October
Victor will be onsite Sept 23rd. to Oct. 14th
Minutes
- Android — part of opencv in Android.
- Python is not built because it runs through Java
- HighGui is only partial (camera support is from Android SDK)
- Features2D
- Needed to change dmatch structure to allow for 1 to many or 1 to 1
- Need to talk to Patrick to make sure he’s OK with this, until now it is in branch
- dmatch now has “imageID” to allow many to 1
- Purpose is when you have several different images for an object class (object class recognition)
- Question is where is Features2D used? VO and object rec stack
- Maria is working on the PASCAL VOC interface (code from Ken Chatfield)
- Also has bag of words for recognizing object categories from descriptors
- TOD + dense stereo (currently in textured_object_detection)
- I can scan them
- Send bag files to Alexander — get rid of stereo optionally
- Change pose to homography
- It’s in object recognition stack
- It is not in REIN
- Put an object on a table, use dense stereo
- Want to add use with selectObj3D
- latentSVM now ratified and on track for integration.
- Sliding window based search
Vadim
- With the help from Ethan Rublee added Android support to OpenCV. That includes build scripts for Crystax NDK (http://www.crystax.net/android/ndk-r# php), Swig-based Java wrappers for OpenCV and the sample code. The current build system does not use CMake, but all the makefiles reside in a separate subdirectory opencv/android, and thus do not affect the rest of OpenCV. Also, the NDK makefiles for all OpenCV modules, except for highgui, have been rewritten not to require manual editing when new files are added or deleted to OpenCV.
- Fixed couple of bugs in the recently optimized lapack subset that caused test freezes.
- Rewrote matrix expressions to avoid templates use, moved it from headers to the source file matop.cpp. While no measurements have been done yet, subjectively it decreased OpenCV build time quite a bit.
- Added Google test framework googletest (http://code.google.com/p/googletest/) to OpenCV. Started implementation of a sample test based on this framework.
Plans:
- implement a few tests using googletest.
- start documentation reorganization
- start Python wrappers extension to include the C++ part of OpenCV.
Victor
- OpenCV:
- buildbot maintenance. Lots of problems arose that were solved by rebooting the server (Mac Pro) as well as all virtual systems (usually they are not rebooted, they just go to suspended mode all the time). Modified scripts for buildbot (fixed some problems with incorrect svn update). Found and removed failed tests from OpenCV (issues #564, 565, 566, 567, 568, 569, 570, 572, 575, 576, 577, 578, 579). Most of these problems are related to the MinGW compiler under Win3# Increased size of RAM for virtual machine with Windows XP. [Alexander]
- TOD:
- Discovered an issue with pose estimation related to incorrect poses of piecewise-planar objects with features found only on one plane. If the object is also symmetric, solvePnP returns an incorrect pose. A new version of the training base with manually selected objects was created to get more accurate point clouds. [Alexander]
- Features2d:
- implemented algorithmic tests on FeatureDetector and DescriptorExtractor. A sample using the new interface for matching a single image to a set of images was implemented. Fixed a problem with reading/writing keypoints. [Maria]
- Shape recognition:
- Experimented with COIL100 dataset ( http://bit.ly/9qREPH ). Chamfer detected 50% of objects, PAS 28% and the combined classifier 61%. The low results can be explained partly by the many objects that are indistinguishable by shape (e.g. green Bulgarian pepper and red Bulgarian pepper). [Ilya]
- Other:
- Victor gave an invited talk at Graphicon’2010 on new features of OpenCV. Also Victor, Kirill and Anatoly gave an invited talk about stereo on GPU. Victor is on the way to the US now.
Last Week Reports: Victor
Object detection:
The training base built by an automatic algorithm results in higher accuracy than the manually created training base. In order to avoid train/test overlap, we used odd frames from the bag files for the training set and even frames for the test set. The number of images chosen for the training set for each object is shown in the table below.
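| Name | Views count in new base | Views count in old base |
|---|---|---|
| 100tea | 40 | 10 |
| all | 63 | 8 |
| bp | 21 | 10 |
| coke | 29 | 12 |
| cs | 14 | 10 |
| gdo | 43 | 7 |
| gt | 9 | 9 |
| gtwl | 38 | 9 |
| jp | 15 | 8 |
| naked | 33 | 10 |
| pc | 21 | 10 |
| rrtea | 9 | 13 |
| ts | 9 | 4 |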
Recognition results are in the table below. There are 200 images per object; OLD and NEW correspond to the old and new training bases. For each base, the first number is the number of images with correct detections and the second is the number of false positives.
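| Name | OLD correct | OLD false positives | NEW correct | NEW false positives |
|---|---|---|---|---|
| 100tea | 131 | 2 | 198 | 2 |
| all | 148 | 3 | 198 | 0 |
| bp | 151 | 0 | 200 | 4 |
| coke | 178 | 1 | 198 | 2 |
| cs | 198 | 1 | 199 | 1 |
| gdo | 151 | 3 | 198 | 0 |
| gt | 193 | 3 | 198 | 0 |
| gtwl | 177 | 2 | 200 | 0 |
| jp | 188 | 3 | 200 | 1 |
| naked | 173 | 1 | 198 | 0 |
| pc | 187 | 4 | 198 | 1 |
| rrtea | 198 | 3 | 198 | 1 |
| ts | 183 | 2 | 199 | 0 |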
The accuracy is so high because the even views in a bag file are very similar to the odd views that were used to create the training set. However, the detection rate on other bag files containing several objects at once is also higher for the new base. Below is a table for a bag file with 826 views and three objects, showing the number of correct detections:
| | OLD | NEW |
|---|---|---|
| Number of views with all object | 520 | 750 |
| Number of views with gdo object | 400 | 654 |
| Number of views with gt object | 675 | 741 |
| False positives | 1 | 1 |
The video demonstrating detection on this bag file is uploaded here:
- http://www.youtube.com/watch?v=Jq3w8_XGjXc , or:
- http://www.youtube.com/watch?v=R-GscZPp070 .
A colored set of points indicates a successful detection of an object. [Alexander]
Features2d:
Another iteration over the interface of one-to-many image matching. A sample for this interface has been added. Minor changes to DMatch structure are proposed. Maria
Shape matching:
Several improvements in the PAS matching algorithm, fused together with chamfer matching, have resulted in a high detection rate for a textureless stapler (11 detections out of 20 stapler images with only one false positive per 160 non-stapler images). We need more textureless objects to experiment with, so we are looking at the COIL100 dataset. Also, we discovered a CVPR 2010 paper on chamfer matching that claims to speed it up by a factor of 100. [Ilya]
vslam:
a lot of time spent on a merge of cturtle_trunk version into stacks/vslam version. There still are some subtle differences but they are negligible. [Victor]
Action Items
Gary
Vadim
- Add build instructions to Android so that Alexander can add to build bot
- Renew Visa
Victor
- Send email on … I forgot what you were trying to say to Kurt and I when skype was garbled
Kurt
From last time
Gary
- (./) Get new scanning from Radu
- (./) Draft OpenCV release include list
- (./) Send Victor foils
- (./) Make sure Victor has a place to stay on visit.
Vadim
- Draft OpenCV release include list
Victor
- Draft OpenCV release include list
Kurt
Agenda
- Finalize what is in the release
Working ideas list:
Key:(./) = Done; :\ = Ongoing, probably make it; :( = Behind; {X} = Cut from list/will not make it.
- (./) Improved camera calibration (wide-angle lens support …)
- This is now the default, flags can be used to fix the new parameters of the ratio model
- :\ Reorganized documentation (that reflects the new OpenCV structure)
- (./) Google test support, sample tests (so that users can add new tests together with the new functionality)
- :\ Active work: Python support extended to the new C++ functionality & MLL
- Need to look at pyopencv
- This might be easiest
- (./) Android support in the build system, (and optionally iphone)
- :\ More bugfixes (reported in the bug tracker).
- (./) Finish features2d interface for one to many image matching (in progress by Maria)
- :\ Include interface to PASCAL Visual Object Challenge and bag of words matching done by Ken Chatfield
- :\ Include latent SVM detector done by Nizhny U (in progress, this depends on U folks, they are aware of tight schedule, (./) agreed to ratify external API with us by Sep 20)
- (./) Add algorithmic/timing tests and documentation on features2d
- Part of this awaits ratification (match 1 image against multiple images)
- (./) Replace inclusions of old OpenCV headers by new ones (obtained after OpenCV restructuring) throughout the library to optimize building time.
- Some tests still include old style headers … plan to fix by release
- (./) 3 inline camera stereo.
- Needs discussion with Kurt about when the camera is too far out of left<=>right image planes. Then there is a problem with scale factors
- Release date is mid to end of October
Victor is onsite to Oct. 14th
Minutes
- Vadim: Python bindings for C++ (keep James for C)
- The difficulty is that C++ automatically allocates the arrays, but Python must know their size
- Implemented a conditional allocator function. If the pointer is null, traditional malloc and free are used
- Otherwise a custom allocator compatible with numpy is used
- Gain the ability to use numpy automatically; will send a report later
- In Python, there is no need to worry about allocation of output arrays; it will be automatic
- Probably spend another week on this to extend to the rest of C++
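The conditional allocation described in these notes can be sketched roughly as follows. This is only an illustration under stated assumptions: `ArrayAllocator`, `CountingAllocator` and `ArrayHeader` are hypothetical names, not OpenCV types, and the "NumPy-compatible" allocator is replaced by a stand-in that counts calls and defers to malloc/free.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical allocator interface; the real OpenCV hook may look different.
struct ArrayAllocator {
    virtual ~ArrayAllocator() {}
    virtual void* allocate(std::size_t bytes) = 0;
    virtual void deallocate(void* data) = 0;
};

// Stand-in for a NumPy-backed allocator: it only counts calls and defers to
// malloc/free so that the sketch stays self-contained.
struct CountingAllocator : ArrayAllocator {
    int allocations = 0;
    int deallocations = 0;
    void* allocate(std::size_t bytes) override {
        ++allocations;
        return std::malloc(bytes);
    }
    void deallocate(void* data) override {
        ++deallocations;
        std::free(data);
    }
};

// Toy array header: holds a buffer and remembers which allocator produced it.
// The caller is expected to call release(); there is no destructor here.
struct ArrayHeader {
    void* data = nullptr;
    std::size_t bytes = 0;
    ArrayAllocator* allocator = nullptr;  // null means "use malloc/free"

    // Reallocate only when the requested size differs from the current one.
    void create(std::size_t newBytes) {
        assert(newBytes > 0);
        if (data && bytes == newBytes) return;
        release();
        data = allocator ? allocator->allocate(newBytes) : std::malloc(newBytes);
        bytes = newBytes;
    }

    void release() {
        if (!data) return;
        if (allocator) allocator->deallocate(data);
        else std::free(data);
        data = nullptr;
        bytes = 0;
    }
};
```

With a null `allocator` the header behaves like a plain malloc-backed buffer; setting the pointer routes every reallocation through the custom object, which is where a binding layer could hand out memory owned by the scripting side.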
- Android: Build system changed to CMake
- Ethan is heading out now to be a Willow Intern, so will be online next week, Oct 4th.
- Only image load and save is implemented in HighGUI
- Pascal VOC
- Bag of words in
- Code needs re-write. Couple of days
GSOC
- In # 2
- Android
- Improved HighGUI, now using QT
- Later
- Object category recognition and detection, with application to automatically work with PASCAL VOC
- Easy reading of PASCAL data
- Color descriptors and bag of words
- Calculation of ROC curves
- Maria is working on refining this code for inclusion
- Spherical image stitching (maybe in October release if Ethan gets in shape)
- Image blending
- {X} Cascaded HOG … but Intel is doing this, might get from them
- Blob tracking (not with Features2D) by Dat Chu (Nicolas Saunier was mentor)
- but need to simplify API, get rid of Boost dependency
- Trajectory analysis (Nicholas)
- Code is on Bitbucket, have to ask Nicholas about the status
- Region Covariance Descriptor, by Stephen McKeague under Mark Asbach
- May go into samples, now in http://bitbucket.org/mark.asbach/opencv-covariance-features
- Victor’s relationship with NNU (Nizhny Novgorod University)
- Might do an object recognition with OpenCV course
- Gets 3 students who would work on OpenCV related things
- Solve PnP improvement, try new solvers by LePetit’s group
- Object scanner: Blob based calibration pattern.
- Help with above. Alternative algorithm for stereo correspondence, ray intersection, trinocular stereo
Victor
Object detection:
- Captured 10 objects for binpicking project and created a training set for them.
- Recognition was tested on three objects (clean_concept, coffee_filter and can_opener), the system is capable of recognizing all of them.
- Pose estimation was tested on 8 test bags (images of each individual bin with lots of objects of the same class, captured by Matei) and the results are promising.
- Objects with more texture, such as can_opener, are recognized robustly (as far as we can tell from a single image; that is, recognition is robust to changes in the algorithm parameters); others, such as egg_poacher, are less robust.
- Loading of the training base has been sped up by precomputing descriptors and storing them in an XML file.
- Fixed a problem with incorrect camera parameters. [Victor, Alexander]
Shape recognition:
- Implemented the first version of Fast Directional Chamfer Matching (FDCM) without algorithmic optimizations.
- Evaluated FDCM on the COIL100 dataset. It detects 75% of objects, outperforming our previous detectors (PAS: 47%, Chamfer matching: 50%, PAS+Chamfer: 61%).
- Implemented a simple framework to test FDCM on the ETHZ Shape Classes dataset or other similar datasets. It scans an image with a sliding window, performs non-maxima suppression of hypotheses, evaluates detections by the Intersection-over-Union criterion, visualizes detections and plots results as false positives per image vs. detection rate.
- Experimented with FDCM on the ETHZ Shape Classes dataset. The current implementation has low recognition results on this dataset. [Ilya]
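Two steps of the test framework described above, non-maxima suppression of hypotheses and the Intersection-over-Union check, can be sketched as below. This is a minimal illustration, not the framework's code: the `Box` type, the greedy suppression scheme and the 0.5 acceptance threshold (a common PASCAL-style convention) are all assumptions that the notes do not specify.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Axis-aligned detection window with a confidence score (hypothetical type).
struct Box { double x, y, w, h, score; };

// Intersection area divided by union area; 0 for disjoint or empty boxes.
double intersectionOverUnion(const Box& a, const Box& b) {
    double x1 = std::max(a.x, b.x);
    double y1 = std::max(a.y, b.y);
    double x2 = std::min(a.x + a.w, b.x + b.w);
    double y2 = std::min(a.y + a.h, b.y + b.h);
    double inter = std::max(0.0, x2 - x1) * std::max(0.0, y2 - y1);
    double uni = a.w * a.h + b.w * b.h - inter;
    return uni > 0.0 ? inter / uni : 0.0;
}

// Greedy non-maxima suppression: visit boxes from best to worst score and
// drop any box that overlaps an already kept box by more than maxOverlap.
std::vector<Box> nonMaximaSuppression(std::vector<Box> boxes, double maxOverlap) {
    std::sort(boxes.begin(), boxes.end(),
              [](const Box& a, const Box& b) { return a.score > b.score; });
    std::vector<Box> kept;
    for (const Box& candidate : boxes) {
        bool suppressed = false;
        for (const Box& k : kept) {
            if (intersectionOverUnion(candidate, k) > maxOverlap) {
                suppressed = true;
                break;
            }
        }
        if (!suppressed) kept.push_back(candidate);
    }
    return kept;
}

// A detection counts as correct when it overlaps the ground truth enough.
// The 0.5 default is an assumption, not a value taken from the notes.
bool isCorrectDetection(const Box& detection, const Box& groundTruth,
                        double minIoU = 0.5) {
    return intersectionOverUnion(detection, groundTruth) >= minIoU;
}
```

Counting the detections that fail `isCorrectDetection` and dividing by the number of test images gives the false-positives-per-image axis mentioned in the notes.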
features2d:
- Integrated bag of words and color descriptors from Ken Chatfield’s code into features2d. [Maria]
OpenCV:
- Installed MinGW TDM for experiments. Plan to test OpenCV under it and possibly switch MinGW on the virtual machines to TDM. [Alexander]
Other:
- Victor gave two invited talks at Graphicon 2010 (OpenCV and stereo) and a talk (together with Radu and Joe Stam from nVidia) on stereo with GPU at nVidia GTC. Lots of
Vadim
The whole last week was spent on experiments with Python bindings. Below is a short "technology preview" of the new bindings:
- I use the existing wrappers created by James and have added new functions that call the C++ API, so the new functionality will be available side by side with the existing C API wrappers.
- The new bindings are only available when Numpy is installed. This limitation can be removed in the future.
- Thus, no cv::Mat/cv::MatND etc. are mapped to Python. Instead, numpy arrays are directly processed by OpenCV.
- OpenCV multi-dimensional dense arrays ( cv::MatND, cv::Mat ) are binary-compatible with Numpy arrays, so input numpy arrays can be converted to cv::MatND/cv::Mat without copying data, just the OpenCV headers need to be constructed.
- Similarly, the output arrays, when they are of correct size and type, can be easily handled. However, when OpenCV needs to reallocate the output array using Mat[ND]::create , we need some sort of callback to reallocate array using Numpy API.
- Such a callback has been added as the new Mat[ND]::allocator member. When it is not NULL, allocator->allocate() and allocator->deallocate() are invoked to allocate/deallocate the array data; otherwise the normal OpenCV functions are used.
- Because the C++ API can have several overloaded variants of the same function, like cv::add:
cv::add(Mat, Mat, Mat&);
cv::add(Mat, Scalar, Mat&);
cv::add(Mat, Mat, Mat&, Mat mask);
cv::add(Mat, Scalar, Mat&, Mat mask);
cv::add(MatND, MatND, MatND&, MatND mask);
cv::add(MatND, Scalar, MatND&, MatND mask);
- I have to implement wrappers for such functions by hand, where the wrappers choose the correct OpenCV function based on the parameter types.
- By now I have implemented the basic functions from core: arithmetical, logical and statistical operations, linear algebra functions, dft, drawing operations etc.
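The hand-written wrappers mentioned above have to pick an overload at run time, because a dynamically typed caller supplies the second operand either as an array or as a scalar. A rough sketch of that dispatch follows; `Array`, `PyArg` and `wrappedAdd` are hypothetical stand-ins (the real wrappers work on Python objects and cv::Mat/cv::Scalar), and the mask variants are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Toy stand-in for an array type; not cv::Mat.
typedef std::vector<double> Array;

// Two overloads, mirroring the array+array and array+scalar variants.
Array add(const Array& a, const Array& b) {
    if (a.size() != b.size()) throw std::invalid_argument("size mismatch");
    Array out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) out[i] = a[i] + b[i];
    return out;
}

Array add(const Array& a, double s) {
    Array out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) out[i] = a[i] + s;
    return out;
}

// Minimal model of a dynamically typed argument arriving from a script.
struct PyArg {
    enum Kind { ARRAY, SCALAR } kind;
    Array array;
    double scalar;

    static PyArg fromArray(const Array& a) {
        PyArg p;
        p.kind = ARRAY;
        p.array = a;
        p.scalar = 0.0;
        return p;
    }
    static PyArg fromScalar(double s) {
        PyArg p;
        p.kind = SCALAR;
        p.scalar = s;
        return p;
    }
};

// Hand-written wrapper: inspects the run-time type and calls the matching overload.
Array wrappedAdd(const Array& a, const PyArg& b) {
    switch (b.kind) {
        case PyArg::ARRAY:  return add(a, b.array);
        case PyArg::SCALAR: return add(a, b.scalar);
    }
    throw std::invalid_argument("unsupported argument type");
}
```

The point of the sketch is only the shape of the wrapper: one entry point per function name, with the overload chosen from the argument types.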
Plans:
- implement wrappers for the most important functions from other opencv modules.
- extend wrapper generation script to support the new API and, probably, C++ classes as well.
Anatoly Baksheev
Accomplishments:
- NPP integration – switched to Cuda # 2
- Updated build system according to changes in NPP for Cuda# 2 and Cuda toolkit, Added search in $CUDA_NPP_ROOT.
- Verified most bugs with NPP (now some functions work fine, ex. copyMakeBorder). Prepared samples and submitted 4 bugs of NPP. Several issues are being investigated.
- Implemented via NPP:
o graphcuts, boxFilter, cvtColor (RGB <-> YCrCb, BGR5x5 <-> BGR, BGR5x5 <-> Gray), exp, log, magnitude.
o Implemented overloads with cv::Scalar of the functions add, subtract, multiply, divide, absdiff (new NPP features). Tests were updated and refactored.
- Stereo DoubleBP – meanShift & BP based stereo matching algorithm.
Now we have a working prototype. meanShift & BP are run on the GPU; all other parts run on the CPU. Our implementation currently takes 43rd place in the Middlebury rating, whereas the authors’ implementation holds 3rd place. (But the difference in the percentage of bad pixels is not more than 8%.)
We are going to investigate the results; we may tweak parameters, or we may have missed something. The algorithm is very slow (2min/frame) now because of the implementation. It could be made much faster on the CPU, and some parts could be ported to the GPU.
- Stereo based on GraphCuts.
The new NPP has a GraphCut labeling function, so now we can very quickly implement a stereo matcher that uses it. We have started doing so and have finished approximately 50% of the work.
Fortunately, we had not started implementing it earlier; otherwise the Itseez and NPP teams would have duplicated the same work.
Plans:
· Continue NPP integration (left: canny, histograms, some filters and some issues)
· Work with DoubleBP and StereoGraphCuts.
· Update Stereo Demo if our ‘megabug’ is fixed, or prepare other repro cases.
Other:
We attended the GraphiCon conference, 21–23 Sep 2010.
Action Items
Gary
- Ask Nicholas about the status of Trajectory Management
Vadim
- Write up python report and send to James and myself
Victor
Kurt
From last time
Gary
Vadim
- Add build instructions to Android so that Alexander can add to build bot
- Renew Visa
Victor
- Send email on … I forgot what you were trying to say to Kurt and me when Skype was garbled
Kurt
Agenda
- Track the release
- OpenCV # 2 ideas
- On schedule?
- Code freeze when?
- General status
- If Victor calls in, we can talk some about foundation status for OpenCV
- Post release, should discuss migrating the ROS Recognition Infrastructure to OpenCV. Want an easy way to use different classifiers etc.
- Status of QT/HighGUI
- Docs
- Doc status
- User annotate
- List of what needs docs
- More and more useful example code
Working ideas list:
Key:(./) = Done; :\ = Ongoing, probably make it; :( = Behind; {X} = Cut from list/will not make it.
- (./) Improved camera calibration (wide-angle lens support …)
- This is now the default, flags can be used to fix the new parameters of the ratio model
- :\ Reorganized documentation (that reflects the new OpenCV structure)
- (./) Google test support, sample tests (so that users can add new tests together with the new functionality)
- :\ Active work: Python support extended to the new C++ functionality & MLL
- Need to look at pyopencv: No, Vadim is writing a universal Python
- (./) Android support in the build system, (and optionally iphone)
- :\ More bugfixes (reported in the bug tracker).
- :\ Finish features2d interface for one to many image matching (in progress by Maria)
- :\ Include interface to PASCAL Visual Object Challenge (VOC) and bag of words matching done by Ken Chatfield
- :\ Include latent SVM detector done by Nizhny U (in progress, this depends on U folks, they are aware of tight schedule, (./) agreed to ratify external API with us by Sep 20)
- (./) Add algorithmic/timing tests and documentation on features2d
- Part of this awaits ratification (match 1 image against multiple images)
- (./) Replace inclusions of old OpenCV headers by new ones (obtained after OpenCV restructuring) throughout the library to optimize building time.
- Some tests still include old style headers … plan to fix by release
- (./) 3 inline camera stereo.
- Needs discussion with Kurt about when the camera is too far out of left<=>right image planes. Then there is a problem with scale factors
- Release date is mid to end of October
Victor is onsite to Oct. 14th
Minutes
- Vadim is adding extra markers to the OpenCV headers that indicate which parameters are inputs and which are outputs, so the headers can be parsed automatically to generate the Python bindings
- This goes one step beyond what James did to generate the wrappers automatically. His parser created a description from the C headers, but a file still had to be edited manually
- New technique automates this
- Documentation
- Doxygen has (had?) problems with overloaded function.
- Doxygen is actively developed, maybe they are slow on bug and features, but they are releasing code
- Maybe we should have links hard coded in to user bugs/notes page
- Need to add example documentation in samples
- Use frames(?)
PASCAL VOC
- Bag of words and color descriptor integrated by Maria
- In progress, hopefully next week to integrate the PASCAL VOC
- Latent SVM
- Some commits from NNU, have to see what their status is
GPU module
- Issue: Changed the way we call the code through CUDA.
- Found a bug in CUDA
- Memory gets overwritten, but this shows up only when linked to OpenCV. It was boiled down to a simple example that has been accepted by the CUDA team, but CUDA won’t release a fix until Nov. Hard to work around
- This runs outside of OpenCV
- HighGUI
- If built with QT, there will be extra functions for buttons, checkboxes and other controls
- The script attempts to find QT and uses it if found. This caused problems on some platforms, so for now:
- User needs to explicitly enable QT in CMake (“use_QT” or “with_QT”).
- If this behaves well, we will turn it on by default
- Want to make contacts with Google streetview and Android teams.
- Streetview: Want stitching to work with their dataformats
- Android: Want OpenCV to work well on ARM (Atom is probably OK), good infrastructure for VR, tracking, calibration …
Schedule:
- 1 week more of Python
- 1 week of doc/bug fixes
- 1 week testing
- Maria free after a week
- Alexander needs to be on TOD
- Victor is onsite until Oct 14
- Ilya might be able to help with html
Vadim
A short progress summary:
- Fixed a couple of bugs in cv::rectify3() and the 3calibration.cpp sample. Thanks to Kurt for the test data!
- Continued working on the extended Python bindings. Wrote a parser (~90% complete) for the OpenCV headers that produces the API description files, very similar to the “api” file created manually by James. The plan is to modify the cv.py wrapper generation script to handle the automatically generated description files for the C++ API.
Victor
Object recognition:
- Support for prosilica camera has been implemented in TOD.
- Several objects were scanned with a prosilica camera that was calibrated with regard to the narrow stereo pair using a scanned checkerboard.
- The algorithm calculates descriptors from the prosilica image and finds 3D coordinates of the keypoints by projecting the narrow stereo point cloud into the prosilica image with the calibration transformation.
- The algorithm was tested on several objects (coffee filter, can opener) and it finds correct poses. Since algorithm parameters have a strong dependence on resolution (5MP vs VGA), they were put into a separate file. [Victor, Alexander]
Features2d:
- Problems with running detector/descriptor testbench (reported by Patrick) were fixed, computation of region intersection area was parallelized with TBB.
- One-to-one keypoint matching with region overlap computation is 6x faster now on a large keypoint dataset.
- Integration of Ken’s VOC sample is in progress. [Maria]
Shape recognition:
- Support for FDCM has been added into TOD. An additional score from chamfer matching helps to find correct poses of objects.
- FDCM has been tested against the code provided by the authors and gives similar results.
- However, the high recognition scores on the Zurich shape dataset reported in the paper are not reproduced by either version of the code.
- A dependency on chamfer_matching (ROS package) breaks TOD (probably an older version of FLANN gets used) so this dependency has been temporarily removed. [Ilya]
Anatoly
Accomplishments:
1) NPP integration
· Implemented via NPP: convertTo (depth conversion), sumWindowColumn, sumWindowRow, Sobel, GaussianBlur, Canny.
· Verified and updated some filters code according to NVidia comments for submitted bugs. One is not a bug, just docs misinterpretation. All work correctly.
· We are still working on some OpenCV functions, that can be implemented via NPP filtering primitives.
2) GPU module
· Added new cvtColor functionality (RGB <-> YUV, RGB <-> XYZ)
3) Double BP
· We re-implemented some parts of the DBP prototype and tweaked parameters, after which we reached 9th place in the Middlebury rating for Tsukuba (the authors hold 3rd place).
· DoubleBP was tested on several Middlebury datasets (Tsukuba, Venus, and Aloe) at small resolutions. Some screenshots and info can be found here (https://docs.google.com/present/edit?id=0AZqfgw3H-ZOjZGRocHNwcDlfMjIyY3FteHp2Z2o&hl=en&authkey=CI3p_rkN). We used small resolutions because it is very slow (for example, up to # 5hours per frame for 950×800 on CPU + GPU).
· We realized that the current most time-consuming part (99.9% of the time, on CPU) can hardly be ported to the GPU effectively. But we can deviate from the paper and implement this part with less precision, but faster and on the GPU. Maybe we will do that.
4) Stereo Graph Cuts
I was too hasty in claiming that NPP could be used for this task. I started implementing the algorithm. When I reached the graph generation step, I realized that, in contrast with other GC based algorithms, this graph is not a regular pixel grid graph, so the NPP function is useless.
Because I had already done some parts, I started learning min GC algorithms and their GPU implementations. Maybe I will be able to implement GC for stereo on GPU (like in NPP but with a more complex data structure). But for now I am not sure that it is possible.
Plans:
· Continue NPP integration (left: histograms, some functionality based on NPP filters)
· Work with DoubleBP and StereoGraphCuts.
· Update Stereo Demo if our ‘megabug’ is fixed, or prepare other repro cases.
Action Items
Gary
- Start adding documentation to examples
- Make contacts with Google streetview and Android teams.
- Streetview: Want stitching to work with their dataformats
- Android: Want OpenCV to work well, good infrastructure for VR, tracking, calibration …
Vadim
- Check on QT status in OpenCV
- Find way to add user notes to OpenCV docs. Create pages off of our wiki at http://opencv.willowgarage.com/wiki/Welcome/Support
- Visa
Victor
- Need 1 to many image matching in Features2D. Talk to Patrick about what this might break.
- Find status of interface to PASCAL VOC
- Find status of LatentSVM
Kurt
From last time
Gary
- (./) Ask Nicholas about the status of Trajectory Management
Vadim
- Write up python report and send to James and myself
Victor
Kurt
Agenda
- Track the release
- OpenCV # 2 ideas
- On schedule?
- Code freeze when?
- General status
- Doc status
- User annotate
- List of what needs docs
- More and more useful example code
- TOD
Working ideas list:
Key:(./) = Done; :\ = Ongoing, probably make it; :( = Behind; {X} = Cut from list/will not make it.
- (./) Improved camera calibration (wide-angle lens support …)
- This is now the default, flags can be used to fix the new parameters of the ratio model
- :\ Reorganized documentation (that reflects the new OpenCV structure)
- (./) Google test support, sample tests (so that users can add new tests together with the new functionality)
- Google test is integrated and works, but tests are not written in it yet
- :\ Active work: Python support extended to the new C++ functionality & MLL
- Need to look at pyopencv: No, Vadim is writing a universal Python
- (./) Android support in the build system, (and optionally iphone)
- :\ More bugfixes (reported in the bug tracker).
- :\ Finish features2d interface for one to many image matching (in progress by Maria)
- (./) Include interface to PASCAL Visual Object Challenge (VOC) and bag of words matching done by Ken Chatfield
- :\ Include latent SVM detector done by Nizhny U (in progress, this depends on U folks, they are aware of tight schedule, (./) agreed to ratify external API with us by Sep 20)
- We need external API fixed and a sample written
- (./) Add algorithmic/timing tests and documentation on features2d
- Part of this awaits ratification (match 1 image against multiple images)
- (./) Replace inclusions of old OpenCV headers by new ones (obtained after OpenCV restructuring) throughout the library to optimize building time.
- Some tests still include old style headers … plan to fix by release
- (./) 3 inline camera stereo.
- Needs discussion with Kurt about when the camera is too far out of left<=>right image planes. Then there is a problem with scale factors
- (./) Updated FLANN is incorporated into OpenCV.
- Release date is end of October
- Victor is onsite to Oct. 14th
- Gary in Taiwan next week
Minutes
- Python status
- All of OpenCV can now be parsed, similar to what James did
- This fully automates what James did, which relied on manually written api description files
- Vadim wrote a tool that automatically produces api description files
- Could be passed to James’s tool, but since we now have classes etc, Vadim is modifying the script to handle this more complex output
- Needs a few more days
- OpenCV now has ~2000 functions (probably 1000 distinct functions)
- Test coverage has not been tracked for a while; tests continue to be added but we need to regenerate the statistics
- Now we have broken up cv into “stacks”, so the comparison isn’t quite the same, cxcore will be comparable
- The problem right now is with build bot, but Alexander is on TOD. He will get back to this to get it working by the release.
- Android
- Ethan may write the installation for Android
- Trying to improve the process
- We need Android # 2 (Nexus One)
- DFT
- mulSpectrums computes the element-wise product of 2 complex arrays
- If you want power, phase, or both, use magnitude, phase, or cartToPolar respectively
- To go back, use polarToCart and merge, and then idft
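The route sketched in the DFT notes above (multiply two spectra element-wise, optionally pass through magnitude/phase, then invert) can be illustrated without OpenCV. The code below is a dependency-free sketch, not the library implementation: `naiveDft`, `multiplySpectra`, `toPolar`, `fromPolar` and `circularConvolve` are hypothetical helper names, and the transform is a naive O(n^2) one-dimensional DFT.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

typedef std::complex<double> Complex;
typedef std::vector<Complex> Spectrum;

// Naive O(n^2) DFT (or inverse DFT), only to keep the sketch self-contained.
Spectrum naiveDft(const Spectrum& x, bool inverse) {
    const double pi = std::acos(-1.0);
    const std::size_t n = x.size();
    Spectrum out(n);
    for (std::size_t k = 0; k < n; ++k) {
        Complex sum(0.0, 0.0);
        for (std::size_t t = 0; t < n; ++t) {
            double angle = (inverse ? 2.0 : -2.0) * pi * double(k * t) / double(n);
            sum += x[t] * std::polar(1.0, angle);
        }
        out[k] = inverse ? sum / double(n) : sum;
    }
    return out;
}

// Element-wise product of two spectra.
Spectrum multiplySpectra(const Spectrum& a, const Spectrum& b) {
    assert(a.size() == b.size());
    Spectrum out(a.size());
    for (std::size_t i = 0; i < a.size(); ++i) out[i] = a[i] * b[i];
    return out;
}

// Cartesian -> polar: magnitude and phase of every coefficient.
void toPolar(const Spectrum& s, std::vector<double>& mag, std::vector<double>& phase) {
    mag.resize(s.size());
    phase.resize(s.size());
    for (std::size_t i = 0; i < s.size(); ++i) {
        mag[i] = std::abs(s[i]);
        phase[i] = std::arg(s[i]);
    }
}

// Polar -> Cartesian: rebuild the complex coefficients.
Spectrum fromPolar(const std::vector<double>& mag, const std::vector<double>& phase) {
    assert(mag.size() == phase.size());
    Spectrum out(mag.size());
    for (std::size_t i = 0; i < mag.size(); ++i) out[i] = std::polar(mag[i], phase[i]);
    return out;
}

// Circular convolution via the spectral product, with a polar round trip in
// the middle to mirror the magnitude/phase route from the notes.
std::vector<double> circularConvolve(const std::vector<double>& a,
                                     const std::vector<double>& b) {
    assert(a.size() == b.size());
    Spectrum fa(a.begin(), a.end());
    Spectrum fb(b.begin(), b.end());
    Spectrum product = multiplySpectra(naiveDft(fa, false), naiveDft(fb, false));
    std::vector<double> mag, phase;
    toPolar(product, mag, phase);
    Spectrum back = naiveDft(fromPolar(mag, phase), true);
    std::vector<double> out(back.size());
    for (std::size_t i = 0; i < back.size(); ++i) out[i] = back[i].real();
    return out;
}
```

Convolving a signal with a unit impulse delayed by one sample should rotate the signal by one position, which makes a convenient sanity check for the whole chain.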
- For documentation, write a script to do a difference before and after on the html output to check whether things are screwed up
- We might think, post release, maybe we should move completely to sphinx since the tool chain is complex and sphinx can produce pdf and html as required
Marius Muja (FLANN – Fast Approximate Nearest Neighbor)
- Upgraded FLANN in OpenCV to version 1.5 (revision 3737).
- Since the new FLANN is mostly templated, the header files have to be accessible for inclusion in client code, so FLANN was moved from 3rdparty/flann to modules/flann.
- The main index class is now cv::flann::Index_, and there is a typedef from cv::flann::Index_ to cv::flann::Index, so existing code that uses FLANN from OpenCV works the same.
- The only change needed in existing client code is adding
#include <opencv2/flann/flann.hpp>
in case FLANN is not indirectly included through some other header file (such as the old-form cv.h), and linking against the libopencv_flann.so library.
Vadim
Progress summary:
- Finished and debugged the OpenCV header parser. The parser is able to parse all the OpenCV headers, extract all the classes and functions, and produce output very similar to the manually created “api” file in the current Python wrapper. The sample output from the whole of OpenCV, as well as the reference “api” from the current wrapper, is attached.
- Now the wrapper generator is updated to handle the automatically extracted API data.
- In order to simplify Python wrappers code, MatND and Mat are merged into a single class Mat. Now MatND is defined as:
- typedef Mat MatND;
- The new n-d capabilities of the Mat class brought very small extra overhead space-wise (the new header is just 16 bytes larger than the previous one) and performance-wise (a single extra comparison “m.dims > 2” is added to the basic array processing functions). Also, compatibility with the old code is preserved. All the tests pass after the modification.
- Fixed OpenCV build on MacOSX after LatentSVM integration.
Victor
TOD:
- Implemented frame pose registration by a chessboard that is scanned together with each object.
- All 3d points are projected into a single reference frame and inliers from different frames can now exist in the same hypothesis.
- The recognition results with prosilica are only slightly better: most of the inliers still come from a single image.
- A coarse-to-fine pose search strategy was implemented:
- first we generate poses from each cluster and look for inliers with a large reprojection error (50),
- then each pose is refined on a new set of inliers and the quality of the pose is estimated by the number of inliers that have a very low reprojection error (~8).
- Poses that result in overlapping projections of objects into a test image are filtered out; we keep the one with the largest number of inliers.
- The algorithm was tested on several items from binpicking collection (coffee_filter, can_opener, egg_poacher) and several items from household collection (tide, mop, tilex).
- The results show that recognition is stable most of the time, while pose estimation is unstable for objects with a small textured surface (perspective is too weak to estimate both distance and orientation).
- Alexander is experimenting with 3D to 3D matching for pose estimation.
- A bug in solvePnP resulting in poses behind the camera was re-discovered and submitted to Vadim. [Victor, Alexander]
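The two-threshold pose search described in this report can be illustrated on a toy problem. The sketch below is not the TOD code: it replaces the 3D pose with a one-dimensional offset between model and observed coordinates, and `Match`, `findInliers`, `refineOffset` and `coarseToFine` are hypothetical names. Only the thresholds 50 and 8 come from the notes.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// 1-D toy correspondence: a model coordinate and where it was observed.
struct Match { double model; double observed; };

// Indices of matches whose residual under `offset` is below `threshold`.
std::vector<std::size_t> findInliers(const std::vector<Match>& matches,
                                     double offset, double threshold) {
    std::vector<std::size_t> inliers;
    for (std::size_t i = 0; i < matches.size(); ++i) {
        double error = std::fabs(matches[i].observed - (matches[i].model + offset));
        if (error < threshold) inliers.push_back(i);
    }
    return inliers;
}

// Least-squares offset over a subset of matches: the mean residual.
double refineOffset(const std::vector<Match>& matches,
                    const std::vector<std::size_t>& subset) {
    assert(!subset.empty());
    double sum = 0.0;
    for (std::size_t i = 0; i < subset.size(); ++i)
        sum += matches[subset[i]].observed - matches[subset[i]].model;
    return sum / double(subset.size());
}

struct PoseScore { double offset; std::size_t fineInliers; };

// Stage 1: collect inliers of the rough hypothesis with a loose threshold.
// Stage 2: refine on those inliers, then score with a tight threshold.
PoseScore coarseToFine(const std::vector<Match>& matches, double initialOffset,
                       double coarseThreshold = 50.0, double fineThreshold = 8.0) {
    PoseScore result;
    result.offset = initialOffset;
    result.fineInliers = 0;
    std::vector<std::size_t> coarse = findInliers(matches, initialOffset, coarseThreshold);
    if (coarse.empty()) return result;
    result.offset = refineOffset(matches, coarse);
    result.fineInliers = findInliers(matches, result.offset, fineThreshold).size();
    return result;
}
```

Among hypotheses whose projections overlap, the notes keep the one with the most inliers; in this toy setting that amounts to comparing `fineInliers` between competing `PoseScore` values.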
Object category recognition:
- Finished integrating the BOW image classification sample. The main sample function was refactored and the sample was made cross-platform.
- The trained vocabulary, SVMs and plotted recall-precision curves for Ken’s sample and the integrated OpenCV sample differ only slightly, because of several sources of randomness in the code. [Maria]
features2d:
- Merged features2d support for one to many image matching (implemented earlier in the “features2d” branch) with the trunk revision.
- Started to test these modifications using some existing OpenCV tests. The changes are not committed yet. [Maria]
Anatoly Baksheev (GPU)
Accomplishments:
1) NPP integration
· Created a filters engine: a set of classes that help to create convolution kernels for different filtering operations and to execute the operations using NPP in the most optimal way.
· Implemented Laplacian, Scharr, GaussianBlur, sepFilter2D. All other filtering functions were switched to the engine. Tests were updated.
· Implemented histogram wrappers: histRange, histEven, evenLevel. Now we are checking what other OpenCV functionality could be implemented via the histogram wrappers.
· Reported NPP bug (742990) “nppiCanny returns NPP_TEX_BIND_ERROR”. But NVidia had known about the bug before.
2) MeanShift segmentation
· Integrated into OpenCV (it was a separate part of DoubleBP before). It uses MS filtering on the GPU and afterwards does 5D point classification on the CPU. Next it generates an image with labeled segments.
· MeanShiftFiltering was updated to return the range coordinates that are used by segmentation.
· Added the ability to eliminate segments with a size lower than a given threshold.
· The CPU part is ~25% of the time and is not portable to the GPU. We plan to speed up several loops with SSE/TBB. Testing is in progress.
3) Created a ROS package with the StereoBM GPU code
· The interface and initialization infrastructure were re-implemented without OpenCV GPU, so it is stable. (NVidia said that the fix for our ‘megabug’ will be available in November’s release.)
· Victor is going to demonstrate it on a robot with a GPU. The only disappointing fact is that the GPU has only 64 cores (I guess it is not a Fermi), and performance won’t be higher than on the CPU.
4) Double BP
· Implemented fast correlation volume calculation on the GPU; now we are debugging it (it returns an image with artifacts). The implementation uses a modified integral-image based approach that we created during CVPR preparation. It is only one part of the DoubleBP algorithm. It runs in less than 1 sec for tsukuba_384x228 (~160 sec before).
· We need a couple of days to fix bugs and estimate quality, plus 1 day if we decide to integrate it into OpenCV.
Plans:
· NPP integration: check if we missed something.
· Work with DoubleBP.
· Switch to implementation of SURF or HOG algorithms.
Action Items
Gary
- Get Victor a Nexus One
- Ask Ethan about Android build scripts
- Contract with Steve
Vadim
Victor
Kurt
From last time
Gary
- Start adding documentation to examples
- Make contacts with Google streetview and Android teams.
- Streetview: Want stitching to work with their dataformats
- Android: Want OpenCV to work well, good infrastructure for VR, tracking, calibration …
Vadim
- Check on QT status in OpenCV
- Find way to add user notes to OpenCV docs. Create pages off of our wiki at http://opencv.willowgarage.com/wiki/Welcome/Support
- Visa
Victor
- Need 1 to many image matching in Features2D. Talk to Patrick about what this might break.
- Find status of interface to PASCAL VOC
- Find status of LatentSVM
Kurt
Agenda
- Track the release
- OpenCV # 2 ideas
- On schedule?
- General status
- Doc status
- User annotate
- List of what needs docs
- More and more useful example code
- TOD
Working ideas list:
Key:(./) = Done; :\ = Ongoing, probably make it; :( = Behind; {X} = Cut from list/will not make it.
- (./) Improved camera calibration (wide-angle lens support …)
- This is now the default, flags can be used to fix the new parameters of the ratio model
- :\ Reorganized documentation (that reflects the new OpenCV structure)
- (./) Google test support, sample tests (so that users can add new tests together with the new functionality)
- Google test is integrated and works, but tests are not written in it yet
- :\ Active work: Python support extended to the new C++ functionality & MLL
- Need to look at pyopencv: No, Vadim is writing a universal Python
- (./) Android support in the build system, (and optionally iphone)
- :\ More bugfixes (reported in the bug tracker).
- :\ Finish features2d interface for one to many image matching (in progress by Maria)
- (./) Include interface to PASCAL Visual Object Challenge (VOC) and bag of words matching done by Ken Chatfield
- :\ Include latent SVM detector done by Nizhny U (in progress, this depends on U folks, they are aware of tight schedule, (./) agreed to ratify external API with us by Sep 20)
- We need external API fixed and a sample written
- (./) Add algorithmic/timing tests and documentation on features2d
- Part of this awaits ratification (match 1 image against multiple images)
- (./) Replace inclusions of old OpenCV headers by new ones (obtained after OpenCV restructuring) throughout the library to optimize building time.
- Some tests still include old style headers … plan to fix by release
- (./) 3 inline camera stereo.
- Needs discussion with Kurt about when the camera is too far out of left<=>right image planes. Then there is a problem with scale factors
- (./) Updated FLANN is incorporated into OpenCV.
- Release date is end of October
- Victor is onsite to Oct. 14th
- Gary in Taiwan next week
Minutes
Vadim
- Python wrapper generator is semi-completed. The code for standalone functions is generated (the debugging is in progress). The classes, ML classifiers in particular, are not wrapped yet.
- Ilya started the work on OpenCV documentation reorganization (the new documentation structure should reflect the new modular code structure).
- 2 build problems have been fixed.
- Many overloaded variants of OpenCV functions have been removed following the introduction of a special ArrayArg type. ArrayArg is a type-safe C++ analogue of CvArr. It can be used to pass Mat&, std::vector<>, CvMat*, Vec<> or Mat_<> to a function as an input or output parameter. This was done to simplify Python wrapper generation, but it is also more convenient for C++ users.
Victor
TOD:
- A method for calculating object pose from stereo has been implemented and is being tested now.
- The results are comparable to solvePnP because the current algorithm allows only clusters that cover a large part of the object.
- We will try tuning various parameters to enable higher detection rate while still having robust pose from stereo.
- A set of ~25 objects has been scanned. [Victor, Alexander]
stereo_gpu:
- Several problems with using stereo on gpu function in ROS have been fixed.
- A package stereo_gpu has been created to try the method on PR2 (bugs in the current version of CUDA do not allow opencv gpu calls).
- The module was provided to Kevin Watts for benchmarking. [Alexander]
features2d:
- Worked on integration of features2d modifications.
- Finished testing the features2d modifications using existing OpenCV tests.
- Implemented algorithmic tests for descriptor matchers (BruteForce and FlannBased).
- They cover all match methods from new interface.
- Did some fixes of FlannBasedMatcher.
- Implemented BruteForceMatcher L2-specializations using the Eigen library for all match methods (not only one-to-one as before).
- match() and knnMatch() are up to 4 times faster, radiusMatch() up to 2 times (because we do not need to look for the nearest).
- Prepared the opencv branch “f2d_ros_chack” to switch ROS to it and check backward compatibility. [Maria]
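The three match methods named above differ only in what they return per query descriptor. The following is a minimal pure-Python sketch of that difference, not the OpenCV or Eigen implementation; the function names and the toy 2-element descriptors are assumptions chosen for illustration.

```python
import math

def l2(a, b):
    """Euclidean (L2) distance between two equal-length descriptors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_match(query, train, k=1):
    """For each query descriptor, return its k nearest train descriptors
    as (train_index, distance) pairs, closest first."""
    results = []
    for q in query:
        dists = sorted((l2(q, t), j) for j, t in enumerate(train))
        results.append([(j, d) for d, j in dists[:k]])
    return results

def match(query, train):
    """Single nearest neighbour per query descriptor (the k = 1 case)."""
    return [m[0] for m in knn_match(query, train, k=1)]

def radius_match(query, train, max_distance):
    """All train descriptors within max_distance of each query descriptor.
    No sorting is done here, since the nearest one is not searched for."""
    out = []
    for q in query:
        out.append([(j, l2(q, t)) for j, t in enumerate(train)
                    if l2(q, t) <= max_distance])
    return out

# Hypothetical toy descriptors, for illustration only.
train = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
query = [[0.9, 0.1], [4.0, 5.0]]

print(match(query, train))
print(knn_match(query, train, k=2))
print(radius_match(query, train, max_distance=1.5))
```

A brute-force matcher evaluates every query/train pair, so the cost is dominated by the distance computation, which is the part a vectorized library can speed up.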
Shape recognition:
- Considerable improvement in the implementation of FDCM. Key steps:
- Improved orientation estimation and removed points with hard-to-estimate orientation,
- Implemented manual template scaling instead of linear interpolation:
- scaled points from contours, approximated them using ApproxPolyDP and then interpolated orientation,
- Used findContours from the chamfer_matching package instead of this function from OpenCV,
- Used precise Distance Transform,
- Implemented processing of closed contours. [Ilya]
Anatoly Baksheev (CUDA Implementation)
Accomplishments:
1) OpenCV GPU module
· Implemented GPU versions of magnitude, magnitudeSqr, phase, cartToPolar and polarToCart (needed for HOGs).
· Added new cvtColor functionality (RGB <-> HSV, RGB <-> HLS).
· Implemented magnitudeSqr and rectStdDev via NPP. We've covered all NPP functionality useful for OpenCV.
· Sobel filter for signed output types is in progress (needed for HOGs). It can’t be implemented using NPP.
2) HOG. We switched to HOG development. So far we have read 3 papers about HOG, played with the people detection sample (OpenCV CPU HOGs + linear SVM), and started studying its sources. We are now preparing to port them to GPU.
3) SURF, SIFT and FAST. We decided to work on SURF in parallel with HOGs.
· Studied the paper on the FAST point detector algorithm. Our conclusion is that it could be ported to GPU with a good speed-up. Implementation will take ~1 day. But fast FAST is not useful without fast SURF.
· Now we are studying papers about SURF and investigating different SURF CPU and GPU implementations. The sources have a GPL license, so we have to create our own implementation.
4) DoubleBP. Fixed bugs and tested DoubleBP with a BM-style correlation volume (a fast replacement for the approach from the paper). With it the algorithm dropped to the bottom of Middlebury’s rating, so the very slow part which we replaced contributes significantly to DoubleBP’s quality. Even now the total run time of the algorithm is about 5-10 times slower than BP with approximately the same quality. Therefore we froze its further development.
Plans:
· Work on HOG, SURF, SIFT and FAST GPU implementations.
· Finish Sobel for signed output types.
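For readers unfamiliar with why magnitude, phase and cartToPolar are listed as HOG prerequisites: HOG bins image gradients by orientation and weights them by magnitude, so each (gx, gy) pair has to be converted to polar form. Below is a minimal element-wise sketch in plain Python, not the GPU or NPP code. The function names are illustrative, and the choice of radians in the range 0 to 2π is an assumption.

```python
import math

def cart_to_polar(xs, ys):
    """Element-wise magnitude and angle (radians, in [0, 2*pi))."""
    mags = [math.hypot(x, y) for x, y in zip(xs, ys)]
    angles = [math.atan2(y, x) % (2.0 * math.pi) for x, y in zip(xs, ys)]
    return mags, angles

def polar_to_cart(mags, angles):
    """Inverse conversion back to x/y components."""
    xs = [m * math.cos(a) for m, a in zip(mags, angles)]
    ys = [m * math.sin(a) for m, a in zip(mags, angles)]
    return xs, ys

# Hypothetical gradient samples (gx, gy), for illustration only.
gx = [3.0, 0.0, -1.0, 0.0]
gy = [4.0, 2.0, 0.0, -2.0]

mags, angles = cart_to_polar(gx, gy)
back_x, back_y = polar_to_cart(mags, angles)
print(mags)
print(angles)
```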
Action Items
Gary
Vadim
Victor
Kurt
From last time
Gary
- (./) Get Victor a Nexus One
- (./) Ask Ethan about Android build scripts
- (./) Contract with Steve
Vadim
Victor
Kurt
Agenda
- Track the release
- OpenCV # 2 ideas
- On schedule?
- General status
- Doc status
- User annotate
- List of what needs docs
- More and more useful example code
- TOD
- Plans
- Status?
Working ideas list:
Key:(./) = Done; :\ = Ongoing, probably make it; :( = Behind; {X} = Cut from list/will not make it.
- (./) Improved camera calibration (wide-angle lens support …)
- This is now the default, flags can be used to fix the new parameters of the ratio model
- (./) Reorganized documentation (that reflects the new OpenCV structure)
- (./) Google test support, sample tests (so that users can add new tests together with the new functionality)
- Google test is integrated and works, but tests are not written in it yet
- :\ Active work: Python support extended to the new C++ functionality & MLL
- Need to look at pyopencv? No, Vadim is writing a universal Python wrapper generator
- (./) Android support in the build system, (and optionally iphone)
- :\ More bugfixes (reported in the bug tracker).
- :\ Finish features2d interface for one to many image matching (in progress by Maria)
- (./) Include interface to PASCAL Visual Object Challenge (VOC) and bag of words matching done by Ken Chatfield
- :\ Include latent SVM detector done by Nizhny U (in progress, this depends on U folks, they are aware of tight schedule, (./) agreed to ratify external API with us by Sep 20)
- We need external API fixed and a sample written
- (./) Add algorithmic/timing tests and documentation on features2d
- Part of this awaits ratification (match 1 image against multiple images)
- (./) Replace inclusions of old OpenCV headers by new ones (obtained after OpenCV restructuring) throughout the library to optimize building time.
- Some tests still include old style headers … plan to fix by release
- (./) 3 inline camera stereo.
- Needs discussion with Kurt about when the camera is too far out of left<=>right image planes. Then there is a problem with scale factors
- (./) Updated FLANN is incorporated into OpenCV.
- Release date is end of October
- Release is delayed 2 weeks due to delay in completing the Python interface (for missing C functions but mainly for ML functions)
Minutes
- Documentation has been re-org’d to reflect new structure
- It now has new feedback link
- Users will have to register with willow wiki
- Then start the wiki page and click on the template to start it
- How to get an automatic empty page
- Need to get the new docs up
- State of Hudson build
- Hudson does not test trunk of opencv
- Build bot is improved http://argus-cv.dnsalias.org:8010/waterfall
- ROS fails
- Kurt
- ESM (Efficient Second-order Minimization), see second paper “Real-time image-based …” at http://www-sop.inria.fr/icare/personnel/sbenhima/index.html
- Matlab and binary version on the web
- Lab automation, vslam, sub-pixel accurate point finding
- Put Ilya on this
- Release
- Python wrappers
- Getting ML learning methods to work is a problem right now. It has never worked and want to add
- Can this be put off to the next release?
- As soon as this is done, we can release
- Need to clear out bugs
- Fixed build bot problems
TOD
- Far fewer cases where there is texture but the object is not found
- Ethan is going to integrate TOD into REIN
- Talked to Matei about REIN. Need to associate meshes of objects to TOD models
- For the first implementation, will use folder based DB.
- Would be good to combine into Matei’s grasping database
- Need to be able to capture a model registered
- Victor wants to put off joining CAD models and point clouds
- Doesn’t want to put into REIN yet
- Current datastructures are complex
- There is an object called “training set” that
- Would rather clean up code before, but if you want … it can be done
- Not run on the 40 objects
- Run on Prosilica sub-set.
- Detection is stable on coffee-filter, cork-screw
- The interface is stable
- Ethan: Looks like in REIN, can make 1 nodelet for training set and detector
- One nodelet can spit out detection and/or pose
- Have to register with graspit models
- Eitan, and Matei have tools for aligning models
Vadim
- Python wrapper generator is pretty close to completion, but some more debugging is needed. MLL classes have been wrapped. Several proxy functions and classes have been added (only within the wrapper code) to cope with problems with overloaded functions.
- Ilya has reorganized OpenCV docs and added links to wiki to each OpenCV function description. When user clicks the wiki link, the corresponding wiki page is loaded into a separate frame within the same browser window. Very cool, actually!
- 13 bugs in OpenCV and OpenCV tests have been fixed – tickets ## 447, 504, 536, 564, 565, 566, 570, 579, 612, 613, 614, 615, 62#
Victor
TOD: created several automated tools:
- extracting images from a set of bag files. This is a quite time consuming process that takes several hours for a set of 40 objects.
- automatically labeling images from just one image. A user is required to select a single frame and draw a mask around an object. The algorithm goes through all other frames, and uses stereo information together with checkerboard pose registration to select only those pixels that project to the input mask in the first frame. The process is two-stage: first we label all objects with no computation, then we start the processing of all other frames. This is implemented in scripts/label.py. The first stage takes around 5-10 minutes for a set of 40 objects.
- automatically training models from a set of labeled objects. This takes around 7-10 hours for a set of 40 labeled objects.
- automated test system for calculating accuracy and pose estimation (is in the testing phase now).
Significant improvement in the detection algorithm because of a change in clustering: we bring back the restriction for matches from different training images to exist in the same cluster. As a result we get clusters that have a higher inlier/outlier ratio.
[Victor, Alexander]
features2d:
- fixed problems with several tests revealed by buildbot:
- cascade-detector (#432),
- hog-detector (#428),
- descriptor-sift (#567),
- descriptor-surf (#568),
- detector-surf (#578).
- Testing of new interface with vslam packages is in progress [Maria]
Anatoly Baksheev (CUDA)
1) OpenCV GPU module
· Prepared table with functionality list, supported types and flags. Some differences with OpenCV CPU were also reflected there.
· Finished Sobel implementation for all types for which sizeof(T) == # This is used in HOGs.
· Implemented GpuMat cv::gpu::compare(GpuMat, GpuMat, op) for op = NOT_EQ (NPP does not support this).
· Implemented Transform algorithm in Thrust-like style for internal purposes only now. After in depth look into Thrust, we realized that it is not suitable for us (no async, internal mem alloc/free, no pitched memory support), but we may take ideas from it and create what we need.
2) HOG. We studied the HOG code from OpenCV, simplified it (it was highly optimized for CPU), and ported all parts to GPU. So now we have a first GPU version with hardcoded parameters. Its output is byte-wise identical to the output of the CPU version. Performance is not yet good (almost the same as on CPU). We are now working on optimization, starting with the slowest kernel.
3) SURF status: Our own implementation is in progress.
Plans:
· Work on HOGs and SURF GPU implementations.
· Maybe create Brute Force descriptor matcher on GPU for effective SURF points matching.
Action Items
Gary
- Talk to James/Brian about getting the new docs up or why they are not up.
Vadim
- Finish the python
- Then look at Android
- Push changes back to Ethan
Victor
- Send TOD model alignment code
Ethan
- Manual alignment
- rein wrapping
Kurt
- Send links, code for ESM algorithm to Victor
From last time
Gary
Vadim
Victor
Kurt
Agenda
- Track the release
- OpenCV # 2 ideas
- On schedule?
- General status
- cout << Mat
- ESM (Efficient Second-order Minimization), see second paper “Real-time image-based …” at http://www-sop.inria.fr/icare/personnel/sbenhima/index.html
- Matlab and binary version on the web
- Hudson build
- Test code improvements
- Doc status
- User annotate
- List of what needs docs
- More and more useful example code
- TOD
- Plans
- Status?
Working ideas list:
Key:(./) = Done; :\ = Ongoing, probably make it; :( = Behind; {X} = Cut from list/will not make it.
- (./) Improved camera calibration (wide-angle lens support …)
- This is now the default, flags can be used to fix the new parameters of the ratio model
- (./) Reorganized documentation (that reflects the new OpenCV structure)
- (./) Google test support, sample tests (so that users can add new tests together with the new functionality)
- Google test is integrated and works, but tests are not written in it yet
- :\ Active work: Python support extended to the new C++ functionality & MLL
- Need to look at pyopencv? No, Vadim is writing a universal Python wrapper generator
- (./) Android support in the build system, (and optionally iphone)
- :\ More bugfixes (reported in the bug tracker).
- (./) Finish features2d interface for one to many image matching (in progress by Maria)
- :\ Add BRIEF descriptor from ROS (roscd brief_descriptor)
- (./) Include interface to PASCAL Visual Object Challenge (VOC) and bag of words matching done by Ken Chatfield
- (./) Include latent SVM detector done by Nizhny U (in progress, this depends on U folks, they are aware of tight schedule, (./) agreed to ratify external API with us by Sep 20)
- (./) We need external API fixed and a sample written
- :\ test code …
- (./) Add algorithmic/timing tests and documentation on features2d
- Part of this awaits ratification (match 1 image against multiple images)
- (./) Replace inclusions of old OpenCV headers by new ones (obtained after OpenCV restructuring) throughout the library to optimize building time.
- Some tests still include old style headers … plan to fix by release
- (./) 3 inline camera stereo.
- Needs discussion with Kurt about when the camera is too far out of left<=>right image planes. Then there is a problem with scale factors
- (./) Updated FLANN is incorporated into OpenCV.
- Release date is end of October
- Release is delayed 3 weeks due to delay in completing the Python interface (for missing C functions but mainly for ML functions)
Minutes
- Python
- Interface is completed. Almost all is working. There are a few bugs
- 2 days debug, 2 days sample
- Android is not tried yet
- More test code
- Tests are failing on 32-bit Ubuntu 8.04 with GCC # #
- So this is excluded, and this is messing up the statistics
- We go back this far in Ubuntu because ROS does.
- This needs to be looked at
- Spend this week updating tests
- Updated numbers this Friday
- Features2D
- All improvements are committed
- Modified all packages in ROS
- There is test coverage for Features2D classes: detectors/descriptors/matching
- Ethan is using.
- Ethan committed a small change to brute force matcher. Some type issues.
- KNN search with integer distance. Easy for integer types to overflow … in brute force not FLANN
- If you look at BRIEF in ROS: the integer match takes a Hamming distance with unsigned int. This made KNN fail; had to change to signed
- Vadim: what is the length of the vector? Never more than 250
- Ethan got working
- Is this a documentation or usage issue?
- One of the matcher functions has the same signature as another.
- But, new users won’t run into this problem
- Status of BRIEF
- Message from Michael Calonder, wants in
- Have in ROS from Patrick
- Should probably take that version
- Try to get in
- It is not rotation or scale invariant as in paper
- Can look at example tests
- in ROS (roscd brief_descriptor)
- Brief has example code on use: video_homography.cpp and match_test.cpp
- There is a 16, 32, and 64 byte descriptor version. Use at least 3#
- 2 versions of Hamming distance: a GCC version and a look-up table version
- Ilya is working on ESM
- Reproduces original results fairly well
- TOD
- New training instructions with crop_auto works
- Images are recognized (5)
- Projects pose onto the bottle
- Skype exchanged between Ethan and Alexander
- Multi scale FAST is not used in TOD
- Plan tomorrow to start with Multi-scale TOD
- Pose might benefit from sub-pixel, but descriptors will too
- ESM will do this, but this is later
- cout << cv::Mat
- Ethan has lots of these
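The BRIEF notes above mention two ways of computing the Hamming distance between binary descriptors, one of them a look-up table. The sketch below is in plain Python and is not the ROS brief_descriptor code. It contrasts a 256-entry byte table with a direct bit count of the XOR; the function names and the 32-byte test descriptors are assumptions for illustration.

```python
# 256-entry table: POPCOUNT[b] is the number of set bits in the byte value b.
POPCOUNT = [bin(b).count("1") for b in range(256)]

def hamming_lut(a, b):
    """Hamming distance between two equal-length byte strings,
    one table lookup per byte of the XOR."""
    return sum(POPCOUNT[x ^ y] for x, y in zip(a, b))

def hamming_bitcount(a, b):
    """Same distance, counting the set bits of the XOR of the
    whole descriptors in one go."""
    x = int.from_bytes(a, "big") ^ int.from_bytes(b, "big")
    return bin(x).count("1")

# Hypothetical 32-byte descriptors, for illustration only.
d1 = bytes([0b10101010] * 32)
d2 = bytes([0b11110000] * 32)

print(hamming_lut(d1, d2), hamming_bitcount(d1, d2))
```

The distance can never exceed 8 times the descriptor length in bytes, so for a 64-byte descriptor it stays at or below 512.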
Ilya Lysenkov (ESM tracking report)
Accomplishments:
- Implemented tracking using Efficient Second-order Minimization (ESM tracking).
- Convergence rate is comparable with the authors’ implementation.
- The non-optimized version runs at 8 fps (512×512 frame, 150×150 template) on the server.
- Both my implementation and the authors’ fail if the initial homography estimate is too far off (e.g. the projected template is 50 pixels away from the true position), because the approximations in the algorithm become invalid. So the algorithm is suitable for tracking but not for detection.
- See attachments for examples of tracking.
- Implemented Fitzgibbon’s method for registration in SolvePnP but have not tested it yet.
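To make the ESM idea in this report concrete, here is a deliberately reduced sketch: a 1D signal and a pure translation instead of an image and a homography. It is not Ilya's implementation or the authors' code. The Gaussian test signal, the function names and all parameters are assumptions. The ESM ingredient it keeps is the Jacobian, taken as the mean of the template gradient and the current warped-image gradient.

```python
import math

def signal(u, center=32.0, sigma=8.0):
    """Smooth 1D 'image': a Gaussian bump (hypothetical test signal)."""
    return math.exp(-0.5 * ((u - center) / sigma) ** 2)

def gradient(u, center=32.0, sigma=8.0):
    """Analytic derivative of signal() with respect to u."""
    return -(u - center) / sigma ** 2 * signal(u, center, sigma)

def esm_translation(true_shift, p0=0.0, n=64, iterations=10):
    """Estimate the shift p such that image(u + p) matches template(u),
    where image(u) = signal(u - true_shift) and template(u) = signal(u)."""
    p = p0
    for _ in range(iterations):
        num = 0.0
        den = 0.0
        for u in range(n):
            # Residual between the warped image and the template.
            error = signal(u + p - true_shift) - signal(u)
            # ESM Jacobian: mean of current-image and template gradients.
            j = 0.5 * (gradient(u + p - true_shift) + gradient(u))
            num += j * error
            den += j * j
        p -= num / den  # least-squares update for a single parameter
    return p

print(esm_translation(3.0))
```

In this toy setting a start a few samples away from the true shift converges within a handful of iterations. The sketch makes no claim about behaviour for large initial errors.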
Issues:
Plans:
- Experiment with ESM tracking on the bio_box.
Attachments:
Vadim
- Python wrappers have been completed. Except for a few small bugs that are now being investigated, the wrappers work well.
Plans:
- Scan the bug tracker.
- Update documentation; write some samples in Python
Anatoly Baksheev (GPU)
Accomplishments:
1) HOG. Fixed bugs. The code was analyzed and some optimizations were done. Speed-up is about 12x on Fermi (vs. the CPU version). For our data set, kernel call overhead is significant, so for higher resolutions we may get an even larger performance increase. We are going to prepare another test data set and test HOG on it. We also still have some optimization ideas to try. I will merge all sources and send the HOG code to James tomorrow.
2) SURF. Almost finished the implementation; now we are testing it. Speed-up is about 10x (depends on parameters). Started work on a GPU implementation of a SURF descriptor matcher (Brute Force).
3) GPU module and other.
· Small changes in the GPU module (added utility classes, compiles with nvcc). From time to time we think about introducing a new Thrust-like layer into the module, for OpenCV users and for internal purposes.
· Compiled and tested DemoCVPR with Cuda # #
· We learned that NPP for Cuda # 2 RC1 under Linux does not contain the Npp32fc type and some functions that are present under win3# So the GPU module does not compile because of this.
Plans:
· Finish optimizations for HOG, add multi-scale capability, add the possibility to change descriptor parameters (now they are hardcoded), integrate into OpenCV. Maybe consider another HOG-based descriptor.
· Implement a Brute Force Matcher for SURF descriptors on GPU.
· Verify all our old issues with Cuda# 2 RC#
Other: This Thursday and Friday are holidays in Russia. Next Saturday is a working day instead.
Action Items
Gary
- Update docs for example code.
Vadim
- Finish python
- Make sure the failing tests are updated.
- Get Gary updated test statistics by Friday … (Monday OK)
- (./) Forward progress on ESM
Victor
- Have NNU write test code for Latent SVM
Kurt
Ethan
- Point Vadim to BRIEF in ROS
- Send Vadim cout << cv::Mat
From last time
Gary
- (./) Talk to James/Brian about getting the new docs up or why they are not up.
Vadim
- :\ Finish the python
- Then look at Android
- Push changes back to Ethan
Victor
- (./) Send TOD model alignment code
Ethan
- Manual alignment
- (./) rein wrapping
Kurt
- (./) Send links, code for ESM algorithm to Victor
Agenda
- Track the release
- OpenCV # 2 ideas
- On schedule?
- General status
- cout << Mat
- ESM (Efficient Second-order Minimization), see second paper “Real-time image-based …” at http://www-sop.inria.fr/icare/personnel/sbenhima/index.html
- Hudson build
- Test code improvements
- Doc status
- User annotate
- List of what needs docs
- More and more useful example code
- TOD
- Ethan’s API
Key release updates:
Key:(./) = Done; :\ = Ongoing, probably make it; :( = Behind; {X} = Cut from list/will not make it.
- (./) Improved camera calibration (wide-angle lens support …)
- This is now the default, flags can be used to fix the new parameters of the ratio model
- (./) Reorganized documentation (that reflects the new OpenCV structure)
- (./) Google test support, sample tests (so that users can add new tests together with the new functionality)
- Google test is integrated and works, but tests are not written in it yet
- (./) Active work: Python support extended to the new C++ functionality & MLL
- Need to look at pyopencv? No, Vadim is writing a universal Python wrapper generator
- (./) Android support in the build system, (and optionally iphone)
- :\ More bugfixes (reported in the bug tracker).
- (./) Finish features2d interface for one to many image matching (in progress by Maria)
- :\ Add BRIEF descriptor from ROS (roscd brief_descriptor)
- (./) Include interface to PASCAL Visual Object Challenge (VOC) and bag of words matching done by Ken Chatfield
- (./) Include latent SVM detector done by Nizhny U (in progress, this depends on U folks, they are aware of tight schedule, (./) agreed to ratify external API with us by Sep 20)
- (./) We need external API fixed and a sample written
- (./) test code … (but fails on some platforms)
- (./) Add algorithmic/timing tests and documentation on features2d
- Part of this awaits ratification (match 1 image against multiple images)
- (./) Replace inclusions of old OpenCV headers by new ones (obtained after OpenCV restructuring) throughout the library to optimize building time.
- Some tests still include old style headers … plan to fix by release
- (./) 3 inline camera stereo.
- Needs discussion with Kurt about when the camera is too far out of left<=>right image planes. Then there is a problem with scale factors
- (./) Updated FLANN is incorporated into OpenCV.
- Release date is end of October
- Release is delayed 3 weeks due to delay in completing the Python interface (for missing C functions but mainly for ML functions)
Minutes
- Code coverage spreadsheets
- Switch to new format (no longer corresponds 1-1)
- Bugs
- 36 bugs were fixed,
- 100 more there of lower priority
- Python done
- documentation for new wrappers is missing
- samples
- “technology preview”. Use C++ docs as a starting point
- cout << Mat … not put in yet
- Android
- put cv prefix
- Test coverage
- Headers are now mixed with sources and so they show in test coverage now
- Some headers have inline, so probably better the way it is now
- BRIEF descriptor
- Should do, but not document right now
- It will be there, but undocumented
- Need to compare results with data in original paper
- Brute force matcher and hamming functor distance measure
- Which pattern is implemented
- ESM
- GPU version that Hauke Strasdat is looking at
- Performance
- For tracking, it can lose the target if the initial estimate is too far away
- Test against the test implementations in Matlab and C
- Test against the code on the web — this was done
- Still does not track fast movements
- Can we get link to code (assigned to Victor)
- Victor thinks finding 3D pose will jiggle depending on mono-views
- We just want rough estimates to camera rotations
- Should work well with planar objects, less well with other
- With other parameterizations for non-planar
- Pure rotation
- Rotation + scale
- TOD
- Gary put up test data
- 4 objects at once
- In Munich, talked with Slobodan Ilic who wants this data
- Need to write
- Paper studying descriptors would be good
- Victor with Radu and Brian
Vadim
- The whole of last week was spent cleaning the bug tracker, especially the tickets about failed tests. 36 tickets have been closed: ## 70, 323, 364, 430, 433, 436, 437, 439, 451, 454, 457, 464, 473, 477, 483, 484, 502, 506, 522, 525, 535, 543, 569, 572, 575, 576, 577, 579, 602, 606, 618, 619, 620, 639, 643, 65# Some of those tickets were obsolete bug reports, some required minor fixes in the tests (like relaxing an error threshold that was too tight), but a dozen or so were real bugs in the library that have been fixed.
- Built OpenCV on Android using Ethan’s build guide at the WG wiki; OpenCV builds fine. No demo applications have been tried yet on the simulator.
Plans:
- Continue to clean the bug tracker.
- Update documentation; write some samples in Python
Victor
Object recognition:
- A fully automated test system has been implemented. It runs through labeled training and test files (only one labeled frame per object is needed) and computes recognition and pose estimation statistics.
- The system is being tested on a set of 12 test objects. [Alexander]
Lab automation:
- Experimented with ESM tracking on the bio_box dataset.
- ESM can track the bio_box in the case of small and slow motion (sent a video of ESM tracking to Victor).
- Accuracy drops significantly in the case of big rotations
- This may be due to the large change in the appearance of the tubes, so using several templates could help handle this case.
- Also ESM lost the object if motion is too fast.
- Created an algorithm to detect bio_box when ESM is lost due to fast motion. The algorithm detects FAST keypoints and then uses radiusMatch.
- The main idea is not to try finding feature correspondences by RANSAC but instead try to determine whether a keypoint is from the object or not.
- This is achieved by counting returned matches for keypoint.
- The bio_box has repeating texture, so keypoints on the object have many matches while other keypoints have few.
- The match count indicates a confidence of object presence. Computed confidences are passed to CamShift, which determines the final position of the object. See the segmentation results in the attached image. [Ilya]
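The match-counting step described in this report can be sketched as follows. This is a plain-Python illustration under stated assumptions, not Ilya's code: the function name, the 2-element descriptors, the radius and the normalization by the largest count are all hypothetical. The CamShift stage that consumes the confidences is not shown.

```python
import math

def match_count_confidence(query, model, radius):
    """For each query descriptor, count the model descriptors within
    `radius` (L2 distance) and scale the counts to [0, 1].
    Repeating texture on the object gives many matches per keypoint."""
    counts = [sum(1 for m in model if math.dist(q, m) <= radius)
              for q in query]
    peak = max(counts) or 1  # avoid dividing by zero when nothing matches
    return [c / peak for c in counts]

# Hypothetical descriptors: the model has near-identical entries,
# standing in for a repeating texture.
model = [[1.0, 1.0], [1.1, 0.9], [0.9, 1.0], [1.0, 1.1]]
query = [[1.0, 1.0], [5.0, 5.0], [1.05, 0.95]]

print(match_count_confidence(query, model, radius=0.3))
```

Keypoints whose descriptors resemble the repeating texture receive a high score, and an unrelated descriptor receives zero. This is the per-keypoint evidence that a mode-seeking tracker such as CamShift can then localize.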
features2d:
- Worked on features2d documentation:
- added description of new functions/classes/methods (for BOW, draw,..),
- modified some existing descriptions due to interface changes (DescriptorMatcher, GenericDescriptorMatcher,..),
- did minor restructuring of features2d doc. [Maria]
Other:
- Victor has participated in ROS workshop in Munich http://www.ros.org/wiki/Events/CoTeSys-ROS-School.
- He taught a 2-hour class on OpenCV and vision in robotic applications.
- Together with Radu, he has also led perception practice. A lot of feedback on OpenCV documentation and user problem cases has been gathered.
- Victor (thanks to Radu) has also met with several people from DLR including
- Heiko Hirschmuller who provided insight on why we are getting different results with our implementation of semi-global optimization.
- Rene Wagner (a PhD student from DFKI Bremen) is playing with OpenCV stereo algorithms now and will try Heiko’s advice before Vadim is free from all the release activity.
- Halcon head Carsten Steger
- Slobodan Ilic from TUM. A lot of interesting discussions on the current state of shape descriptors as opposed to keypoint descriptors.
- He is interested in the object dataset we are creating.
Anatoly Baksheev (GPU)
Accomplishments:
1) HOG
· Tried several variants of the histogram calculation kernels, with optimization iterations for each. The classify kernel and the L2-Hys normalization kernel were also reworked. So far our best implementation provides a speed-up of about 20-30x on Fermi, depending on parameters. We think that for some bigger descriptor sizes the speed-up may be higher.
· The most time-consuming part is now the classification kernel (just a dot product) because of the amount of work. So we are close to the finish.
· We still have ideas to try, but these won’t give more than 20% and are applicable only to Fermi (48k smem).
· Plan to allow the user to change descriptor parameters (bigger winSize, cellSize, blockSize, winStride, blockStride, etc.). Now they are hardcoded. This may cause contention for registers and occupancy. Afterwards we will integrate into OpenCV.
2) SURF
· Implemented a simple BruteForceMatcher (for each descriptor in the query set it finds the closest from the train set). Speed-up is about 100-200x. This matcher can be used for any descriptor of any size, not only for SURF.
· We did not make any major modifications to SURF itself; it still shows a 10x speed-up and supports only the 64 * sizeof(float) descriptor size.
· Plan to focus on SURF + BFM performance and functionality (maybe 128-float SURF, find all descriptors within a given radius, different norms, etc.)
3) OpenCV GPU
· Fixed compilation errors under Ubuntu. These occurred because NPP under Linux does not contain some types and functions (for example Nppi32fc) that are present under Windows. We did not do it before because of our ‘megabug’.
· Created a local version of the FindCUDA.cmake script to distribute with OpenCV. The script included in CMake does not find Cuda# 2 (the toolkit directory structure was changed), and we can’t wait any longer for the next CMake release. The script is being tested.
· Plan to verify all our old issues with Cuda# 2 RC# Regret that no new NPP build is available.
· Began looking into Anton’s face detection GPU code.
Plans:
· Finish HOG work, integrate into OpenCV.
· Focus on SURF + BFM performance and functionality.
· Verify all our old issues with Cuda# 2 RC#
Other: 2 days were holidays. This Saturday is a working day instead.
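The L2-Hys normalization kernel mentioned in the HOG item applies a three-step scheme to each block histogram: L2-normalize, clip large values, then L2-normalize again. The plain-Python sketch below follows that general scheme and is not the GPU kernel. The clip value of 0.2 and the small epsilon are assumptions (0.2 is the commonly used threshold), and the function name is illustrative.

```python
import math

def l2hys_normalize(block, clip=0.2, eps=1e-6):
    """L2-Hys: L2-normalize, clip each value at `clip`, L2-normalize again.
    `block` is a list of non-negative histogram values."""
    norm = math.sqrt(sum(v * v for v in block) + eps * eps)
    clipped = [min(v / norm, clip) for v in block]
    norm2 = math.sqrt(sum(v * v for v in clipped) + eps * eps)
    return [v / norm2 for v in clipped]

# Hypothetical block histogram with one dominant bin.
block = [10.0, 1.0, 1.0, 1.0]
normalized = l2hys_normalize(block)
print(normalized)
```

Clipping limits the influence of a single very strong gradient bin, so after the second normalization the dominant bin takes a smaller share than plain L2 normalization would give it.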
Action Items
Gary
- (./) Put up new coverage
- Create TOD datasets
- Talk to person about IP
- Update docs for example code.
Vadim
- Reformat coverage spreadsheets to maximize overlap with past format
- Put in cout << cv::Mat
- Send Latent SVM test failures on some platforms to Alexi
Victor
- Send Kurt the link to ESM code
Kurt
Ethan
- Add Brief to features2D
- Sample
- Test
From last time
Gary
- :\ Update docs for example code.
Vadim
- Finish python
- Make sure the failing tests are updated.
- :\ Get Gary updated test statistics by Friday … (Monday OK)
- (./) Forward progress on ESM
Victor
- (./) Have NNU write test code for Latent SVM
Kurt
Ethan
- Point Vadim to BRIEF in ROS
- Send Vadim cout << cv::Mat
Agenda
- Track the release
- OpenCV # 2 ideas
- On schedule?
- General status
- PASCAL VOC connection? What is the status? Docs? Examples?
- Hudson build
- Test code improvements
- Doc status
- List of what is missing
- More and more useful example code
- FUTURE:
- TOD progress
- Stephan Holtzer’s closed contours
- Solutions in Perception Challenge
- ESM (Efficient Second-order Minimization), see second paper “Real-time image-based …” at http://www-sop.inria.fr/icare/personnel/sbenhima/index.html
- Victor’s trip? Time set yet?
Key release updates:
Key:(./) = Done; :\ = Ongoing, probably make it; :( = Behind; {X} = Cut from list/will not make it.
- (./) Improved camera calibration (wide-angle lens support …)
- This is now the default, flags can be used to fix the new parameters of the ratio model
- (./) Reorganized documentation (that reflects the new OpenCV structure)
- (./) Documentation auto-generated for wiki — include user feedback page for each function.
- (./) Google test support, sample tests (so that users can add new tests together with the new functionality)
- Google test is integrated and works, but tests are not written in it yet
- (./) Active work: Python support extended to the new C++ functionality & MLL
- Need to look at pyopencv? No, Vadim wrote a parser so that OpenCV now has a complete Python interface and compatibility with numpy arrays
- (./) Android support in the build system, (and optionally iphone)
- :\ More bugfixes (reported in the bug tracker).
- (./) Finish features2d interface for one to many image matching (in progress by Maria)
- (./) Add BRIEF descriptor from ROS (roscd brief_descriptor)
- (./) Include interface to PASCAL Visual Object Challenge (VOC) and bag of words matching done by Ken Chatfield
- (./) Include latent SVM detector done by Nizhny U (in progress, this depends on U folks, they are aware of tight schedule, (./) agreed to ratify external API with us by Sep 20)
- (./) We need external API fixed and a sample written
- (./) test code … (but fails on some platforms)
- (./) Add algorithmic/timing tests and documentation on features2d
- Part of this awaits ratification (match 1 image against multiple images)
- (./) Replace inclusions of old OpenCV headers by new ones (obtained after OpenCV restructuring) throughout the library to optimize building time.
- Some tests still include old style headers … plan to fix by release
- (./) 3 inline camera stereo.
- Needs discussion with Kurt about when the camera is too far out of left<=>right image planes. Then there is a problem with scale factors
- (./) Updated FLANN is incorporated into OpenCV.
- (./) cout << cv::Mat
-
Release date is end of November
- Release was delayed 3 weeks due to the delay in completing the Python interface (for missing C functions but mainly for ML functions) and the need to clean out all important bugs
Victor Nov 23 around 1 30 — Dec 5th
Minutes
RELEASE
- Bug fixes
- Important bugs left to be fixed ~12
-
API review of Features2D — minor changes in header files
- Some functions will change names in API
- For example, cloneWithoutData will become clone(bool data);
- Maria will finish tomorrow
-
PASCAL VOC loading code
- Supposed to be in samples
- Test code
- Plan to create a couple of tests using Google test code for user examples
- Existing tests will remain as is to save work … and more bugs
- user contrib exists
- need a way to allow user to be able to selectively call this in
- Cmake scripts now contain a single line for modules and its dependencies
- Use global expressions
- Or another top level directory such as contributed
-
ESM
- Kurt talked to
- Matlab code might not be complete
- Working with full 3D objects
- This is now in ROS, posest
- Eigen
- … eventually support
- OpenCV now has matrix conversion functions back and forth.
- They use full copy of the data
- New docs
- Edit tex files
- Or add new tex files
- but then update the master latex files
- OpenCV doc directory
- Files start with module names
- create image_processing_stitching.tex
- Add to the master tex files
- Then you can use scripts to add on line docs and pdfs
- Online docs are generated by: tex → .rst (James’s script) → Sphinx tool → html
- Need to run the script in the latex2sphinx subdirectory. The script is “buildall”
- We don’t use doxygen because it becomes very verbose, especially with complex interfaces. Python does better.
- Maybe with foundation, we can get a technical writer
- Hudson build system
- Some issues with test system. Some crashes need reboot. Right now it autoreboots on crash
- Anatoly is going to fix this after the release because we need the regular builds right now for the release and this isn’t part of the release
- Work left
- Bugfixes
- docs for features2D
- Update version number, release notes etc
- package up code for release
- Run the tests for all the platforms and configurations
TOD:
- Lots of small fixes
- Main problem — We match more descriptors to allow for low texture objects
- bad matches fall close together and are clustered
- so reprojection error is not large
- so quite a few false alarms for background objects that match to training set
- Ransac pnp for each cluster
- found pose, look for high reprojection error
- find rough inliers
- findPnP again with stricter reprojection error
- then if supporting cluster is still small, use stdev measure ratio between training and test
- if this is different, reject
- Most TOD changes involve 1 object out of N where N > 50
- Using FLANN in this case
- MOPED is using brute force and claims to work with 100 objects
- In bin picking, you only have 1 object that you are looking for
- Patrick is using TOD in bin picking for grasping
- Ethan’s effort
- Have dataset of 43 objects
Closed contours
- Stephan Holtzer may contribute this to OpenCV. It uses thresholding, lines and distance transforms to do line completion for closed contours.
Victor
-
TOD:
- Experiments with 43 objects in the training base have shown detection issues. A lot of objects were not detected because of the “feature starvation” — few features from correct object were selected by matching.
- In order to deal with this we now search for 30 nearest neighbors and run ratio test for each of the objects in this set independently, getting several matches instead of just one.
- The detection rate has increased dramatically together with false positive rate. Now we are working on reducing false positives.
- Other improvements in TOD:
- A new success criterion for the test system has been implemented. It takes into account occlusions of test objects. Bug in pose filtering has been fixed. Parameter tuning. [Alexander]
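The multi-match step described above (search for 30 nearest neighbors, then run the ratio test independently per object) can be sketched roughly as below. This is a plain-Python illustration, not the TOD code: the function name `per_object_ratio_matches`, the brute-force search and the 0.8 ratio threshold are all assumptions made for the example.

```python
import math
from collections import defaultdict

def per_object_ratio_matches(query, train, k=30, ratio=0.8):
    """Return {object_id: train_index} for one query descriptor.

    `train` is a list of (object_id, descriptor) pairs. The k nearest
    neighbours are found by brute force, grouped by object, and the
    ratio test is applied inside each group independently.
    """
    # Distance from the query to every training descriptor, nearest first.
    neighbours = sorted(
        (math.dist(query, desc), idx, obj)
        for idx, (obj, desc) in enumerate(train)
    )[:k]

    # Group the k neighbours by object; each group stays sorted by distance.
    by_object = defaultdict(list)
    for d, idx, obj in neighbours:
        by_object[obj].append((d, idx))

    matches = {}
    for obj, cands in by_object.items():
        best = cands[0]
        # A lone candidate has no second neighbour to compare against, so it
        # is accepted (an assumption); otherwise require best < ratio * second.
        if len(cands) == 1 or best[0] < ratio * cands[1][0]:
            matches[obj] = best[1]
    return matches
```

One query descriptor can now yield a match for several objects at once, which is consistent with both the higher detection rate and the higher false positive rate noted above.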
-
Features2d:
- Bugfixes.
- Fixed grabCut segmentation accuracy in the case of singular covariance of some gaussian component.
- Fixed MserFeatureDetector when keypoint size estimation is equal to 0.
- Resolved the tickets: 595, 485, 68, 423, 666, 305, 373, 354, 372, 665, 353, 265, 672, 673, 59#
- Fixed compile error under win (after BRIEF adding). [Maria, Ilya]
-
Lab automation:
- Implemented homography parametrization by different Lie algebras to use various transformations (e.g. affine, euclidean, translation and so on) in ESM tracking.
- Refactored ESM code and integrated it in the posest package of the vslam stack in ROS.
- Created regression test for ESM tracking using results of the authors’ matlab script as ground truth data.
- Improved ESM accuracy from 1.12 to 0.95 (this is the average ratio of the SSD error returned by my implementation to the error returned by the matlab script; the ratio is computed after each iteration). [Ilya]
- Implemented detection of test-tubes holes using adaptiveThreshold and morphology (see attachments).
- A quick experiment has shown that BRIEF can be used to filter out most of the outliers.
- The detection of a pattern based on geometric hashing is in progress
- See attached images [Ilya, Victor]
- All keypoints:
- Filtered keypoints
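The ESM accuracy figure quoted above (an average, over iterations, of our SSD error divided by the matlab script's SSD error) can be computed as in this small sketch. The function name `mean_ssd_ratio` and the sample values in the usage note are made up for illustration; a result below 1 means the implementation's error is, on average, smaller than the reference.

```python
def mean_ssd_ratio(ours, reference):
    """Average of ours[i] / reference[i], one SSD value per iteration."""
    if len(ours) != len(reference) or not ours:
        raise ValueError("need two non-empty sequences of equal length")
    ratios = [a / b for a, b in zip(ours, reference)]
    return sum(ratios) / len(ratios)
```

For example, `mean_ssd_ratio([2.0, 1.0], [2.0, 2.0])` averages the per-iteration ratios 1.0 and 0.5.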
Vadim
18 tickets have been closed:
- 71, 131, 157, 161, 281, 341, 385, 430, 434, 468, 497, 524, 557, 572, 575, 591, 671, 678
Plans:
- Fix remaining important bugs.
- Update documentation; write some samples in Python
- Update installer on Windows
Anatoly Baksheev
Accomplishments:
1) HOG
· Added the ability to change some parameters and a multi-scale mode. The code was refactored, made more readable and integrated into OpenCV. Created a regression test. NPP resize was replaced with our own because of a bug in the old CUDA.
· Its performance is from 3x to 30x of OpenCV HOG CPU for Fermi 470. We don’t see any way to increase it. Top speed is achieved for high resolutions and for big HOG windows. For low resolutions (and for some scale levels in multi-scale mode) CPU is faster.
· Created sample and simple demo with HOGs.
2) SURF & Brute Force Descriptor Matcher
· Improved SURF_GPU code (removed unnecessary classes, moved some constants to parameters). Fixed bug in SURF_GPU in per warp reduction. We found that results still differ from run to run, which means there is another bug. Trying to find it.
· Finished implementation of descriptor matching with arbitrary types and sizes (universal case, ~100x faster than simple CPU implementation, 15-20x faster than Eigen library, Core 2 Duo E8400, SSE, OpenMP).
· Implemented ‘find descriptor within radius’ functions (~10x faster than simple CPU implementation, 3-6x faster than Eigen library on Core 2 Duo E8400, SSE, OpenMP)
· Implemented support of train collection (list of train sets) and masks for BFM. Now the matcher almost repeats the corresponding CPU matcher interface (support for different norms is left to implement).
3) OpenCV GPU – minor improvements and tasks.
· Verified the ‘megabug’. It was fixed by NVidia. But we still have instability in Stereo Demo and Gpu Module. We guess that is because we use old NPP that is linked with old CUDA.
· Debugged and fixed error reporting in OpenCV + Stereo Demo (exceptions in destructors, uncaught exceptions in background threads).
· Refactoring. Added internal utility classes and functions (numeric_limits_gpu, shared mem with template, bindTexture, uploadConstant, draft of GpuMat with template). Fixed OpenCV CPU headers, so ‘gpu.hpp’ header can be compiled with NVCC.
· Local FindCUDA.cmake script is still being tested.
4) Anton’s face detection code.
· Took a first look at this code and understood some of the interfaces. Sent comments to Anton. Will continue communication about the code integration.
Plans:
1) Focus on SURF bugs, performance and functionality.
2) Maybe start implementing the requested functions from Joe’s list.
3) Think what else we could do for HOGs.
Action Items
Gary
- Create a user contrib
- (./) Ask about Victor in the guest house
- (./) Sync up bin picking effort with Patrick
- Talk to person about IP
- Update docs for example code.
Vadim
- (./) code coverage
Victor
- Sync up with Ethan on textured_object_detector
Kurt
Ethan
- Sync up with Victor on tod
From last time
Gary
- (:/) Update docs for example code.
Vadim
- (./) Finish python
- :\ Make sure the failing tests are updated.
- (./) Get Gary updated test statistics by Friday … (Monday OK)
- (./) Forward progress on ESM
Victor
- (./) Have NNU write test code for Latent SVM
Kurt
Ethan
- (./) Point Vadim to BRIEF in ROS
- (./) Send Vadim cout << cv::Mat
Agenda
- Release tracking:
- OpenCV 2.2 ideas
- On schedule?
- General status
- PASCAL VOC connection? Docs?
- Hudson build
- Test code improvements
- Doc status
- List of what is missing
- More and more useful example code
-
FUTURE:
- TOD progress
- Stephan Holtzer’s closed contours
- Solutions in Perception Challenge
- ESM (Efficient Second order Minimization; see the second paper, “Real-time image-based …”) http://www-sop.inria.fr/icare/personnel/sbenhima/index.html
Key release updates:
Key:(./) = Done; :\ = Ongoing, probably make it; :( = Behind; {X} = Cut from list/will not make it.
- (./) Improved camera calibration (wide-angle lens support …)
- This is now the default, flags can be used to fix the new parameters of the ratio model
- (./) Reorganized documentation (that reflects the new OpenCV structure)
- (./) Documentation auto-generated for wiki — include user feedback page for each function.
- (./) Google test support, sample tests (so that users can add new tests together with the new functionality)
- Google test is integrated and works, but tests are not written in it yet
- (./) Active work: Python support extended to the new C++ functionality & MLL
-
Need to look at pyopencv? No, Vadim wrote a parser so that OpenCV now has a complete Python interface and compatibility with numpy arrays
-
- (./) Android support in the build system, (and optionally iphone)
- :\ More bugfixes (reported in the bug tracker).
- (./) Finish features2d interface for one to many image matching (in progress by Maria)
- (./) Add BRIEF descriptor from ROS (roscd brief_descriptor)
- (./) Include interface to PASCAL Visual Object Challenge (VOC) and bag of words matching done by Ken Chatfield
- (./) Include latent SVM detector done by Nizhny U (in progress, this depends on U folks, they are aware of tight schedule, (./) agreed to ratify external API with us by Sep 20)
- (./) We need external API fixed and a sample written
- (./) test code … (but fails on some platforms)
- (./) Add algorithmic/timing tests and documentation on features2d
- Part of this awaits ratification (match 1 image against multiple images)
- (./) Replace inclusions of old OpenCV headers by new ones (obtained after OpenCV restructuring) throughout the library to optimize building time.
- Some tests still include old style headers … plan to fix by release
- (./) 3 inline camera stereo.
- Needs discussion with Kurt about when the camera is too far out of left<=>right image planes. Then there is a problem with scale factors
- (./) Updated FLANN is incorporated into OpenCV.
- (./) cout << cv::Mat. Supports output to opencv, python, numpy and matlab formats
-
Release date is end of November
- Release was delayed 3 weeks due to the delay in completing the Python interface (for missing C functions but mainly for ML functions) and the need to clean out all important bugs
Victor Nov 23 around 1# 30 — Dec 5th
Minutes
- Bug tracker is being cleared
- About 50 reports marked as major left, but many are silly or very specific (not major — 2 cameras not working in some environment)
- Probably all real major problems have been resolved
- Vadim wants 3 days to look through the rest
- cout << cv::Mat
- This works and can output for python, matlab etc
- Should there be an >> operator for cv::Mat?
- Or write an xml or yml and have openCV read it in?
-
TOD
- Progress, decreased size of training base
- False positives, 2 types: (1) finding non-existing objects and (2) wrong pose
- 4 false positives of first type
- Similar logos
- Send examples
-
REIN in ROS Recognition Infrastructure
- Will move TOD into this framework today
- Documentation
- Converting to Sphinx directly will eliminate one step
- Sphinx can be produced from this and it looks good
- More like wiki with latex math syntax
- VSLAM
- Lab automation
Victor
-
TOD:
- Struggling with high false positive rate on a dataset of 43 training objects. Several bugs have been fixed.
- The algorithm for choosing a winner is changed.
- We used to choose an object with the maximum amount of inliers, filtering out objects with small support area (defined as a ratio of a standard deviation of 3d points corresponding to inliers to the standard deviation of the whole point cloud of the training object).
- The new criterion is the reverse: the winner has the maximum support area among all hypotheses with a number of inliers higher than a threshold. The standard deviation is now based on robust estimation (10% of outliers are filtered out).
- Loading of a training set is now 3 times faster.
- The current results for three test bag files with clutter are below. One of the reasons for high false alarm rates is misclassifying similar logos (e.g. corn_holder and steak_knives). [Alexander]
Current results:
test bag                                              detection rate   false positives rate
/shared/binpick/test_base//test1_cards/               0.952381         0.0190476
/shared/binpick/test_base//test3_avocado_slicer/      0.844037         0.256881
/shared/binpick/test_base//test3_gogo_brush/          0.711712         0.27027
/shared/binpick/test_base//test1_4medals/             0.947917         0.21875
/shared/binpick/test_base//test3_baking_cup/          0.880734         0.284404
/shared/binpick/test_base//test1_coffee_filter/       0.719512         0.0243902
/shared/binpick/test_base//test3_mommy_messages1/     0.733333         0.32381
/shared/binpick/test_base//test2_drain_stopper/       0.914894         0.0638298
/shared/binpick/test_base//test2_party_parasols1/     0.913043         0.0652174
/shared/binpick/test_base//test2_toy_scissors/        0.989583         0
/shared/binpick/test_base//test1_clogx/               0.807692         0.0288462
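The new winner criterion described above (maximum support area among hypotheses with enough inliers, with a robust standard deviation) could look roughly like this sketch. It is not the TOD implementation: the names `robust_spread` and `pick_winner`, the dict layout of a hypothesis, and the choice to trim the 10% of points farthest from the centroid are assumptions.

```python
import math

def robust_spread(points, trim=0.10):
    """RMS distance to the centroid, ignoring the farthest `trim` share.

    The centroid is computed from all points; trimming is applied to the
    sorted distances afterwards (an assumed form of robust estimation).
    """
    n = len(points)
    centroid = [sum(p[i] for p in points) / n for i in range(3)]
    dists = sorted(math.dist(p, centroid) for p in points)
    keep = dists[: max(1, n - int(trim * n))]
    return math.sqrt(sum(d * d for d in keep) / len(keep))

def pick_winner(hypotheses, min_inliers=10):
    """Pick the hypothesis with the largest support area.

    Each hypothesis is a dict with 'name', 'inliers' (3D points that
    support the pose) and 'cloud' (all 3D points of the training object).
    Only hypotheses with more than `min_inliers` inliers are considered.
    """
    best_name, best_area = None, -1.0
    for h in hypotheses:
        if len(h["inliers"]) <= min_inliers:
            continue
        cloud_spread = robust_spread(h["cloud"])
        if cloud_spread == 0.0:
            continue  # degenerate training cloud, ratio undefined
        area = robust_spread(h["inliers"]) / cloud_spread
        if area > best_area:
            best_name, best_area = h["name"], area
    return best_name, best_area
```

Under the old criterion a hypothesis with many tightly clustered inliers would win; here it loses to one whose inliers cover more of the object.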
-
Features2d:
- Refactored features2d and sample on matching to many images after code review (some renames, clone(), constructors parameters, comments,… ).
- Fixed processing of cases where an input image, keypoints vector or descriptors matrix is empty.
- Closed tickets 670, 65#
- Documentation update after the API review is in progress. Maria
-
Lab automation:
- An algorithm for detecting bio box has been implemented. Keypoints found by the hole detector are merged into an adjacency graph (first, all points closer to each other than a threshold are connected with an edge, then long edges are filtered out by k-means clustering into 4 clusters corresponding to the principal orientations of a grid). Then an approximate search for a maximum isomorphic subgraph of the constructed graph and the training 8×12 grid is performed.
- BRIEF is used to infer the position of the missing line of points if the found subgraph is too small.
- The results on a test bag file are shown in the attached video. [Ilya]
- Other: Alexander was on vacation for two days, Victor was on sick leave for two days; he is now in flight to the US.
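The first step of the bio box detection above (connect all keypoints closer than a threshold) is simple enough to sketch. Only that step is shown; the k-means edge filtering and the subgraph search are left out. The function name `build_adjacency` and the threshold value in the usage note are assumptions, not the lab_automation code.

```python
import math
from itertools import combinations

def build_adjacency(points, max_dist):
    """Connect every pair of 2D points closer than `max_dist`.

    Returns a dict mapping each point index to the set of its neighbours.
    """
    graph = {i: set() for i in range(len(points))}
    for i, j in combinations(range(len(points)), 2):
        if math.dist(points[i], points[j]) < max_dist:
            graph[i].add(j)
            graph[j].add(i)
    return graph
```

On a unit-spaced grid, a threshold a little above 1 (for instance 1.2) links horizontal and vertical neighbours but not diagonal ones.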
Vadim
- Last week I continued to clean the bug tracker. 39 tickets have been closed:
- 111, 116, 264, 275, 299, 300, 316, 349, 381, 384, 409, 420, 478, 514, 518, 519, 529, 553, 544, 550, 558, 588, 589, 590, 624, 625, 642, 651, 656, 659, 664, 671, 678, 681, 682, 690, 691, 692, 69#
- Plus an additional 3 tickets from SF tracker.
- two of them are quite remarkable:
- 1) one is multi-threaded implementation of SURF detector. So we now have 2x faster SURF detector/descriptor.
- 2) another one is big patch for EM algorithm, which should significantly improve the accuracy.
- started updating documentation to reflect the new/modified functionality (in particular merging of Mat and MatND)
- Windows installer update is in progress. All the scripts are modified and the testing is started.
- Finally, SF bug tracker is closed. All the tickets will be transferred to code.ros.org.
Plans:
- Fix remaining important bugs.
- Finish updating the documentation; write some samples in Python
- Test OpenCV on all the platforms
- Write the Release Notes & ChangeLog; update version number; prepare the packages for the final testing
Anatoly Baksheev (GPU)
Accomplishments:
1) HOG
· Added possibility to set a variable winStride.
· Implemented ‘getDescriptors’ functions (via a CUDA kernel that merges pieces of histograms from many addresses into a single array in Dalal’s format). HOG was able to work only in detection mode before.
· Fixed HOG sample compilation under Ubuntu. Fixed incorrect scale bug.
· Added people classifier for winSize = 48x96.
2) SURF and Brute Force Matcher
· Optimized ‘find closest descriptor’ function. Now ~20-30x faster than Eigen (was ~15-20x).
· Found and fixed bug in ‘find all within radius’ function.
· Implemented ‘find n nearest neighbor’ function (10x faster than CPU).
· SURF status: no change this week.
3) OpenCV GPU
· Added bitwise operations for GpuMat. Operations with mask were also implemented. Started work on countNonZero, cvMinMaxLoc.
· Updated error reporting for NPP (now we manually transform the returned error code to a message string; it would be great if NPP had this). Fixed compilation error for Linux (because the headers of the previous version of NPP for Linux do not contain some error codes).
Plans:
1) Continue work on SURF performance and functionality; maybe it is time to integrate it into OpenCV.
2) Continue implementing the requested functions from Joe’s list.
3) Verify all our issues with the next release of Cuda3#. Investigate reasons for the Stereo Demo crash.
4) Begin Anton’s code integration.
Action Items
Gary
- Take a look at >> for cv::Mat
- More realistic test set
- Send Ilya plate pictures
Alexander
- Send objects that have type 1 false positives
Vadim
Victor
Kurt
Ethan
From last time
Gary
- :\ Create a user contrib
- (./) Ask about Victor in the guest house
- (./) Sync up bin picking effort with Patrick
- Talk to person about IP
- (./) Update docs for example code.
Vadim
- (./) code coverage
Victor
- Sync up with Ethan on textured_object_detector
Kurt
Ethan
- Sync up with Victor on tod
Agenda
- Release tracking:
- OpenCV 2.2 ideas
- Schedule
- Docs
- PASCAL VOC Docs?
- Missing background subtraction docs
- Samples docs
- Hudson build
- Test code improvements
- Doc status
-
FUTURE:
- Panoramic stitching (Ethan)
- Really usable apps
- Planar objects stack
- TOD progress
- Stephan Holtzer’s closed contours
- Actively run solutions
- ESM (Efficient Second order Minimization; see the second paper, “Real-time image-based …”) http://www-sop.inria.fr/icare/personnel/sbenhima/index.html
Key release updates:
Key:(./) = Done; :\ = Ongoing, probably make it; :( = Behind; {X} = Cut from list/will not make it.
- (./) Improved camera calibration (wide-angle lens support …)
- This is now the default, flags can be used to fix the new parameters of the ratio model
- (./) Reorganized documentation (that reflects the new OpenCV structure)
- (./) Documentation auto-generated for wiki — include user feedback page for each function.
- (./) Google test support, sample tests (so that users can add new tests together with the new functionality)
- Google test is integrated and works, but tests are not written in it yet
- (./) Active work: Python support extended to the new C++ functionality & MLL
-
Need to look at pyopencv? No, Vadim wrote a parser so that OpenCV now has a complete Python interface and compatibility with numpy arrays
-
- (./) Android support in the build system, (and optionally iphone)
- (./) More bugfixes (reported in the bug tracker).
- Except Highgui issues. There are some camera issues on Linux, but we don’t have these, so we can’t fix them yet
- Python binding tickets — some should be analyzed by James
- (./) Finish features2d interface for one to many image matching (in progress by Maria)
- (./) Add BRIEF descriptor from ROS (roscd brief_descriptor)
- (./) Include interface to PASCAL Visual Object Challenge (VOC) and bag of words matching done by Ken Chatfield
- (./) Include latent SVM detector done by Nizhny U (in progress, this depends on U folks, they are aware of tight schedule, (./) agreed to ratify external API with us by Sep 20)
- (./) We need external API fixed and a sample written
- (./) test code … (but fails on some platforms)
- (./) Add algorithmic/timing tests and documentation on features2d
- Part of this awaits ratification (match 1 image against multiple images)
- (./) Replace inclusions of old OpenCV headers by new ones (obtained after OpenCV restructuring) throughout the library to optimize building time.
- Some tests still include old style headers … plan to fix by release
- (./) 3 inline camera stereo.
- Needs discussion with Kurt about when the camera is too far out of left<=>right image planes. Then there is a problem with scale factors
- (./) Updated FLANN is incorporated into OpenCV.
- (./) cout << cv::Mat. Supports output to opencv, python, numpy and matlab formats
-
Release date is end of November
- Release was delayed 3 weeks due to the delay in completing the Python interface (for missing C functions but mainly for ML functions) and the need to clean out all important bugs
Victor Nov 23 around 1# 30 — Dec 5th
Minutes
- Docs
- There is a C++ api to background subtraction, but no docs for background subtraction
- put in stub for background
- Victor should document bag of words sample, PASCAL VOC use more
- Python
- James’s script is quite good, and produces light weight wrappers. Vadim sends thanks.
- Remaining nits for release
- Gary would like to add docs to output and header of all samples
- Gary would like to add focus detector and pyramid to contrib
- Vadim to add script for which functions are not documented
- This can also catch bugs for incorrect documentation
- Installation script looks good for Windows and Linux
- Linux tar ball works for mac: run cmake, then make, then make install
- For Android, just put in a link for android notes
- Vadim is writing the change log for the release
- New coverage
- Update cheatsheet pdf
- update version number,
- prepare packages,
- let people at Willow and Itseez try to install on various operating systems
- Future stuff
- Move docs to sphinx
- Ethan’s pano stitching to OpenCV
- Lighting into OpenCV
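The "script for which functions are not documented" mentioned in the minutes could start from something like the sketch below. It is only an illustration, not Vadim's script: it assumes exported functions are declared on one line with the `CV_EXPORTS` / `CV_EXPORTS_W` macros and treats a function as documented if its name appears anywhere in the doc text. The helper name `undocumented` is hypothetical.

```python
import re

# Assumed convention: "CV_EXPORTS <return type> name(...);" on a single line.
DECL_RE = re.compile(r"CV_EXPORTS(?:_W)?\s+[\w:<>&*\s]+?\b(\w+)\s*\(")

def undocumented(header_text, doc_text):
    """Return the sorted names declared in the header but absent from the docs."""
    declared = set(DECL_RE.findall(header_text))
    mentioned = set(re.findall(r"\b\w+\b", doc_text))
    return sorted(declared - mentioned)
```

Swapping the two sets (`mentioned - declared`, restricted to names the docs mark as functions) would give the reverse check, which is how such a script could also flag incorrect documentation.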
Gary
- Added docs to header and print out of > 20 samples
Plans
- Lots of spam filtering on the wiki site
- Finish off samples
- Open up a page for them in the wiki docs
- Add 2 user contribs
Victor
-
TOD:
- Is live on PRL in willow green room! The recognition is good for the objects near the camera, getting worse with the distance.
- Several improvements in filtering false alarms, better visualization of the results.
- Minor code refactoring.
- Victor discussed with Ethan how the work should be shared, the new API has been drafted.
- Ethan will start working on the training base,
- Alexander will start refactoring the recognition object. [Victor]
-
TOD ongoing modifications:
- moved minstdDev (parameter for filtering solutions with a compact inlier cluster because of unstable pose) to the config file
- changed null cloud filtering, now I have only 1 false positive of first type (detection of objects that are not present in a test image)
- implemented crop_auto for ROS messages (CameraInfo, PointCloud, Image). The plan is to join crop, crop_auto and crop_ROS into one pipeline. Began implementing a common class for this purpose (see crop_object_auto# cpp).
- finished test system modifications, now we calculate common FPs only once, furthermore it works three times faster. [Alexander]
-
lab automation:
- The code for detecting bio box holes has been refactored and committed to ROS (sandbox/lab_automation package). [Ilya]
-
Circles calibration template:
- The algorithm used for detecting regular grid of holes was adapted to the task of detecting a grid of black circles on white background.
- The algorithm uses a threshold-based blob detector followed by constructing a graph where neighboring blobs are connected, filtering out edges by clustering and then using graph matching to find the final solution.
- The calibration has been tested on two cameras, the reprojection error is smaller for the new calibration template (see the table below).
- The correct detection is 49 out of 50 samples, an example of detection is attached. [Ilya]
Reprojection error by calibration template and camera:
template      Canon   Logitech
chessboard    0.83    0.23
circles       0.26    0.12
-
features2d:
- Added tests for some feature detectors: GridAdaptedFeatureDetector (“GridFAST”), PyramidAdaptedFeatureDetector (“PyramidFAST”).
- Added tests for some descriptor extractors: BriefDescriptorExtractor, OpponentColorDescriptorExtractor (“OpponentSIFT”, “OpponentSURF”).
- Added to existing tests of FeatureDetector, DescriptorExtractor, DescriptorMatcher the check of cases where input data is empty (they must not generate exceptions).
- Regenerated test data after fix of png decoder.
- Fixed BriefDescriptorExtractor (on rgb image), OpponentColorDescriptorExtractor, SIFT.
- Made createFeatureDetector, createDescriptorExtractor,… functions as static create() methods in the corresponding classes. Updated their implementations to create all supported detectors, descriptors, matchers.
- Updated doc on create(), added doc on BriefDescriptorExtractor. Maria
-
mll:
- Made random generators of mll classes dependent on the default rng (theRNG) (#205); previously they were initialized with cvRng(-1) in many classes.
- Limited rng seeds in mll tests (dtree, rtrees, boost, ertrees) to 5 values to make their result checks more stable. Restored these tests (#662, #460, 474, 505).
- Fixed bug in CvDTree breaking traincascade (#517). Maria
-
Other closed tickets:
- #419, 188, 200. Maria
Vadim
- 40 tickets have been closed:
- 172, 175, 266, 314, 325, 331, 351, 382, 392, 400, 407, 414, 425, 456, 457, 467, 511, 518, 538, 549, 563, 605, 611, 622, 658, 667, 697, 698, 701, 702, 703, 704, 705, 708, 712, 714, 715, 716, 717, 718,
- Installation scripts (parts of CMake scripts) for Windows & Unix have been updated for the new directory structure. The test installation package for Windows looks good and nothing seems to be missing in it.
- 17 OpenCV C samples have been converted to C++ and OpenCV 2.x API use.
- Chamfer matcher, written by Marius Muja and updated by Antonella Cascitelli, Marco Di Stefano and Stefano Fabri, has been integrated to OpenCV, contrib module.
- The change log, release notes and the installation guide are being updated now.
There are very good chances that OpenCV will be released today or tomorrow.
Anatoly Baksheev (GPU)
Accomplishments & Activities:
1) OpenCV GPU
· Debugging of DemoCVPR and another of our samples is in progress. So far we have not found a clear repro case.
· Added CUDA & NPP version checks in the source code and CMake scripts (findNPP cmake). We want to force users to install the latest libraries.
· Analyzed the sources of Kevin’s Stereo on Robot Benchmark and sent him comments explaining why the GPU is slower than the CPU (most probably because of speckle filtering, which is not run on the CPU by default). Optimizing StereoBM on the GPU for lower resolutions was unsuccessful.
2) SURF & BFM
· Added 128-float support for SURF. Investigating differences between the CPU and GPU implementations.
· Added support for L1 & L2 norms for BFM. Added a variant of the ‘find n nearest neighbors’ function for a train collection.
· Some code reorganization. Found and fixed some bugs in knnMatch (storing distance, thread synchronization) and in radiusMatch (sorting results).
3) Joe's list
· Implemented countNonZero, minMaxLoc, minMax with and without masks for any types. Compared the performance of the no-mask, no-location variant with the NPP function. NPP is faster by 1.15x-1.8x. Trying to understand why.
· Implemented Harris corner detector for floats. Going to add support for other types. Going to add support for border extrapolation; this will require updating the filter engine, and the update will be useful not only for Harris.
Plans:
1) Continue debugging the ‘megabug’.
2) Continue implementation of Joe’s list.
3) Integrate SURF into OpenCV GPU.
4) Begin Anton’s code integration.
Action Items
Gary
- Try out android install
- Try to copy in the cheatsheet in the docs
Alexander
Vadim
- Put in document stub for background subtraction
- Update coverage
- Update cheatsheet, add cout << cv::Mat stuff
- Post release, discuss panoramic API with Ethan
Victor
- Document use of PASCAL VOC more
Kurt
Ethan
- Post OpenCV 2.2 release, discuss with Vadim about pano API
From last time
Gary
- (./) Take a look at >> for cv::Mat — probably not worth doing
- ? More realistic test set
- :\ Send Ilya plate pictures
Alexander
- (./) Send objects that have type 1 false positives
Vadim
Victor
Kurt
Ethan
Agenda
- Release 2.2 is out, any open issues
- Next priorities
- Documentation
- Algorithmic (Meta-theme: We want to start focusing on higher level functionality)
- Higher
- Panoramic stitching (Ethan)
- On a manifold (Shmuel Peleg type stuff ‘’search on “mosaic”’’)
-
Solutions in Perception Challenge
- TOD with scaling
- Read/write Train/Test/Report data in ROS→OpenCV
- 2D+3D solutions for textureless, transparent
- Parts/whole
- Face finding, tracking in 3D
- Primesense skeleton
- Planar objects stack
- ESM (Efficient Second order Minimization; see the second paper, “Real-time image-based …”) http://www-sop.inria.fr/icare/personnel/sbenhima/index.html
- New calibration
- Lower
- Stephan Holtzer’s closed contours
- Gaps
- Dealing with lighting
- Finding intrinsic image
- Shape from shading
- Finding lighting sources
Minutes
- Release 2.2
- Need to do rolling release scheme
- Need alternate installation packages that are posted
- Right now, we have a huge manual effort to get packages together
- Need to automate testing and install of packages
- Main priority until end of Jan for Vadim
- Support for Kinect
-
ROS work is documented at http://www.ros.org/wiki/kinect_calibration/technical
- Patrick calibrated their RGB camera better with depth
- Was not done using epipolar search but 3D to 3D
- Their RGB camera is crappy
- Lens is OK
- But is rolling shutter and low dynamic range
- Uses a bad Debayering algorithm
- Could use multiple frames
- But this imager probably only has 6 bits of info, also it is rolling shutter
- Need to calibrate in another, better, camera
- 3rd camera will not be inline, but will sit on top
- Same thing that Patrick did will probably work
- What was done
- Took the RGB camera and calibrated its focal length and distortion
- Did stereo calibration between the RGB and IR cameras to get the 6DOF transform between them
- With the 6DOF transform and the 3D points from the IR camera,
- then reprojected RGB back onto the IR camera
- Have to think about de-mosaicing the Bayer pattern better — James used a more complex pattern that works better
-
Can you de-mosaic and rectify at the same time?
- De-mosaicing to gray scale would work well
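The registration steps listed under "What was done" can be sketched roughly as follows. This is only an illustration under stated assumptions: a plain pinhole model with no lens distortion, and made-up values for the rotation, translation and intrinsics (none of these numbers come from the actual Kinect calibration). The function names are hypothetical.

```python
# Sketch: map a 3D point measured in the IR (depth) camera frame to a pixel
# in the RGB image, so that the RGB colour can be attached to the depth point.
# Assumptions: pinhole model, no distortion, placeholder calibration values.

def transform_point(R, t, p):
    """Apply the rigid 6DOF transform p' = R * p + t (R is 3x3, t and p are 3-vectors)."""
    return [sum(R[i][j] * p[j] for j in range(3)) + t[i] for i in range(3)]


def project_pinhole(fx, fy, cx, cy, p):
    """Project a 3D point in camera coordinates to pixel coordinates."""
    x, y, z = p
    if z <= 0:
        raise ValueError("point is behind the camera")
    return (fx * x / z + cx, fy * y / z + cy)


def ir_point_to_rgb_pixel(point_ir, R_ir_to_rgb, t_ir_to_rgb, rgb_intrinsics):
    """Move the point into the RGB camera frame, then project it."""
    point_rgb = transform_point(R_ir_to_rgb, t_ir_to_rgb, point_ir)
    return project_pinhole(*rgb_intrinsics, point_rgb)


# Hypothetical calibration: identity rotation, 2.5 cm baseline along x.
R = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
t = [-0.025, 0.0, 0.0]
rgb_intrinsics = (525.0, 525.0, 319.5, 239.5)  # fx, fy, cx, cy (placeholders)

u, v = ir_point_to_rgb_pixel([0.1, 0.0, 1.0], R, t, rgb_intrinsics)
print("RGB pixel for the IR point:", u, v)
```

In a real pipeline the rotation and translation would come from the stereo calibration between the two cameras, and the distortion model would have to be applied as well.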
Vadim
- OpenCV # 2 has finally been released. The changelog is here:
- http://opencv.willowgarage.com/wiki/OpenCV%20Change%20Logs
- Except for a few glitches it works fine.
- If there are any serious problems discovered, we can release some minor update.
- For this purpose I created opencv/branches/# 2 instead of opencv/tags/# #
- In this branch we can put the fixes (2 fixes are already there).
- I’m now thinking of greatly extending the work scope of “continuous integration” server.
- I intentionally say “continuous integration” instead of “buildbot”, because after some internet search I find that most people prefer Hudson over buildbot. In particular, one guy said about buildbot:
“[buildbot is] … What everyone thinks they should use until they actually try to use it.” and he also said “…. just use Hudson” and there are a lot of similar opinions, even from Python lovers and Java haters (Buildbot is written in Python; Hudson is written in Java).
- We should migrate to Hudson. We should also:
- test not the raw code, but first construct installation packages, then unpack them, build and test. That will give us more solid testing of the final product.
- build and test CUDA.
- build and test samples. Yes, opencv samples can be automatically tested – we just need to make some tweaks to highgui.
- measure coverage
- test online documentation.
- That will bring us very close to the “rolling release” concept, i.e. we can put out OpenCV snapshots almost every day.
- And theoretically we can release OpenCV at any point in time.
- Another big topic is documentation.
- The script that checks the correctness and completeness of the docs is still in progress.
- I’m willing to spend some time on revising the documentation content and the representation.
- Porting to RST is one of the first things I would do.
Victor
Circle calibration pattern:
- the code for detecting the pattern has been plugged into the ROS calibration node and demoed at the Friday show-and-tell.
- The detection is realtime and robust.
- This solution will be pushed into OpenCV trunk right after the release. [Victor]
-
More test data
- for all PR2 cameras (narrow, wide stereo, prosilica and forearm) has been collected.
- The current algorithm does not handle large distortions, assuming that the grid consists of two groups of parallel edges.
- An improvement to this has been implemented that finds a subgrid, does homography rectification and then runs the algorithm again.
- The detection rate has been considerably improved (see below) [Ilya]
narrow_left wide_left r_forearm_cam
Previous algorithm 0.69 0.56 0.30
Projective rectification 0.85 0.82 0.60
features2d:
- Resolved the tickets on cascade training and MLL ##406, 554, 629. [Maria]
TOD:
- Investigated the causes for run-to-run variation
- The root cause is the different order of results returned by flann.
- Another source of variation is the clustering algorithm.
- Crop refactoring is in progress. There are 4 different versions of crop now that are being integrated. [Alexander]
Anatoly Baksheev (GPU)
Accomplishments:
1) Megabug.
· Solved the issue with the crash in BP’s destructor (it was a bad compilation flag).
· Debugging and simplifying the Stereo Demo. As a result, I know that the problem (kernel crash) is in OpenCV GPU StereoBM or in the build process. Will continue the investigation.
2) Joe’s function list.
· Implemented cornerHarris and cornerMinEigenVal (with support for the BORDER_REFLECT101 and BORDER_REPLICATE border types). Created a variant of linear filters that supports different border extrapolation types.
· Added some estimation of the thread block configuration for minMax, minMaxLoc and countNonZero for different image sizes.
· Implemented the first version of matchTemplate. It uses a brute-force algorithm and is faster than the CPU only for small templates. For big templates the CPU outperforms the GPU because of its FFT-based template matching technique. Started reading papers about FFT-based 2D convolution. Going to use it, and I expect a good speed-up on GPU. Work on an OpenCV FFT2D wrapper for CUFFT is also in progress.
3) SURF & Brute Force Matcher.
· BFM optimization: for radiusMatch the downloading part is ~2-4x faster now. Integrated BFM into opencv-gpu. Created a test for it. I think it’s final.
· SURF::compute_descriptor128 optimizations: increased occupancy, but the speed-up is not as expected. Investigating.
· Found and fixed a bug in SURF.
· Found a difference from the CPU version: in ‘compute descriptor for user defined points’ mode the feature orientation is not calculated, i.e. the user needs to calculate the orientation themselves. A fix is in progress.
4) Other.
· Implemented per-element min and max functions.
· Added support for CV_8SC4, CV_16SC2, CV_16UC2, CV_32SC1 and CV_32FC1 to transpose (NPP call for CV_8UC1).
Plans:
· Implementation of FFT-based matchTemplate.
· Continue debugging ‘megabug’.
· Integrate SURF into opencv-gpu.
· Begin Anton’s code integration.
· Need to write documentation, or at least a getting-started guide and FAQ.
Gary
- Added usage documentation to dozens of samples so that all .c and .cpp sample files now say what they do and how to run them.
- This documentation prints out when you run the sample
- Working on setting up an opencv foundation, stronger bonds with Android
- Would like to get some external people to do a good iPhone/iPad port as well.
Action Items
Gary
Alexander
Vadim
Victor
Kurt
- Send Vadim Bayer images from kinect camera
Ethan
From last time
Gary
- :\ Try out android install
- :\ Try to copy in the cheatsheet in the docs
Alexander
Vadim
- (?) Put in document stub for background subtraction
- (./) Update coverage
- (?) Update cheatsheet, add cout << cv::Mat stuff
- Post release, discuss panoramic API with Ethan
Victor
- :\ Document use of PASCAL VOC more
Kurt
Ethan
- (./) Post OpenCV# 2 release, discuss with Vadim about pano API
Agenda
- Parser to aid interfacing to other languages, specifically Java. Samuel Audet has a Java C interface already.
- Release aftermath
- Spam on http://opencv.willowgarage.com/wiki/Welcome/People?action=diff&date=1287771459554675
Minutes
- Release
- 8 bugs
- The worst one is that for some Windows users the camera won’t work. Some users have found workarounds.
- Need to verify user fixes
- Anatoly found some problems with OpenCV demo code
- All the fixes going for now into https://code.ros.org/svn/opencv/branches/# 2
- Kinect Bayer pattern has been improved.
- Seems like the Bayer image from the kinect is already pre-processed (not so well) so we can’t further improve this since the textures we might want are already gone.
- The open stuff for kinect is on http://www.openni.org/
- Or for ROS: http://www.ros.org/wiki/ni
- Let’s put out sample code for people to use Kinect with OpenCV
- We need to think about being able to work with point clouds and images
- PCL uses eigen to represent point clouds
- Maybe we should make this more transparent
- Inplace conversion back and forth
- Need to think of how to play better with PCL
- Can we go back and forth with no data copy
- Simple 3D viewer. There is a sample
Anatoly Baksheev (GPU)
-
Accomplishments:
- 1) StereoBM_GPU and stereo demo.
- · Found and fixed 2 bugs. Stereo demo is stable now! One of the bugs exists even in the CVPR application, but does not show itself.
- · Going to rework StereoBM_GPU a bit, in order to make the code more understandable.
- 2) Experiments with stereo calibration. Going to do the last experiment tomorrow, and conclude this.
- 3) cv::matchTemplate.
- · Added cv::sum, cv::sqrSum, cv::columnSum for floats. Added a cv::FFT wrapper that calls cuFFT.
- · Implemented 2D cross-correlation using FFT, and then a block-based version of 2D cross-correlation (with the optimal block size hardcoded after some experiments on Quadro). Compared results with the brute-force algorithm, relative error ~1e-# But will investigate more.
- · Added matchTemplate with the CV_TM_CCORR method using the FFT-based cross-correlation. Performance is 5x-20x (for different template sizes) that of the CPU version for CV_32F on Quadro.
- · Added matchTemplate with CV_TM_SQDIFF for CV_32F, CV_8U, and CV_TM_SQDIFF_NORMED for CV_8U, and CV_TM_CCORR_NORMED for CV_8U.
- · Switched the ‘cv::integral’ images implementation from the NPP to the NPP_staging interface. Waiting for a float version of ‘integrals’ from Anton.
- 4) Other
- · Added support of 4 channels images to StereoBeliefPropagation and StereoConstantSpaceBP.
- · Fixed bug in GPU filter engine (incorrect buffer type) and in vector’s saturate_cast;
- · Added support of BORDER_REFLECT101, BORDER_REPLICATE and BORDER_CONSTANT border type to some GPU linear filters;
- · Answered questions on Yahoo Groups and provided some support to beta testers.
- 5) SURF. Checking another optimization idea, but spent only a little time for this.
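The idea behind the FFT-based cross-correlation mentioned in item 3 can be illustrated with a small sketch. It is a 1D, pure-Python version written for readability, not the GPU code from the report: the real work is 2D and calls cuFFT, and every function name below is made up for this example. The sketch zero-pads both inputs, multiplies the signal spectrum by the conjugate of the template spectrum, and checks the result against the brute-force sum.

```python
import cmath


def fft(values, invert=False):
    """Recursive radix-2 FFT (unnormalised); len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], invert)
    odd = fft(values[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def cross_correlate_direct(signal, template):
    """Brute force: result[i] = sum_j signal[i + j] * template[j]."""
    n_out = len(signal) - len(template) + 1
    return [sum(signal[i + j] * template[j] for j in range(len(template)))
            for i in range(n_out)]


def cross_correlate_fft(signal, template):
    """Same result via the correlation theorem: IFFT(FFT(s) * conj(FFT(t)))."""
    n_out = len(signal) - len(template) + 1
    size = 1
    while size < len(signal) + len(template):  # pad so nothing wraps around
        size *= 2
    fs = fft([complex(v) for v in signal] + [0j] * (size - len(signal)))
    ft = fft([complex(v) for v in template] + [0j] * (size - len(template)))
    product = [s * t.conjugate() for s, t in zip(fs, ft)]
    corr = fft(product, invert=True)
    return [corr[i].real / size for i in range(n_out)]


signal = [1, 2, 3, 4, 5]
template = [1, 0, -1]
direct = cross_correlate_direct(signal, template)
via_fft = cross_correlate_fft(signal, template)
print(direct, via_fft)
```

The brute-force cost grows with the product of image and template sizes, while the FFT route depends mainly on the padded size, which is consistent with the report that brute force only wins for small templates.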
-
Plans:
- · Integrate NPP_staging into the build system, along with Anton’s face detection code.
- · Finish matchTemplate (normalized methods, color images support, finish quality testing, etc.)
- · Finish with SURF_GPU and integrate it into OpenCV.
- · Read a paper found on the Internet about connected components on GPU. If there is a good idea there, then we can implement ‘cvFindContours’ and ‘speckle filtering’ on GPU!
- · Fill in the OpenCV GPU wiki page with ‘Getting started’ and FAQ.
Victor
-
TOD:
- Refactoring of TrainingSet is in progress.
- Alexander was on sick leave for most of the previous week; refactoring should be completed this week. [Alexander]
-
Census:
- Experimented with Census and AD-Census measures for stereo correspondence problem.
- Tried Census measures with BM (my simple implementation without any post-processing such as textureless-region and speckle filtering) and Belief Propagation algorithms (cv::gpu).
- It seems that Census is more accurate on object boundaries than SAD, and the Census BM disparity map can be used for further refinement.
- But Census results on Willow textured stereo images are worse than StereoBM (SAD).
- The latest winner in the Middlebury benchmark also uses AD-Census (there is no link to the paper; only the following information is available:
- "Anonymous. Accurate and efficient stereo matching with AD-census measure, cross-based regions and multi-step refinement. CVPR 2011 submission 1039").
- Comparisons of BM with SAD and Census are attached below. [Maria]
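For readers unfamiliar with the Census measure, here is a minimal sketch of the general technique, not of Maria's implementation. The 3x3 window, the bit ordering and the function names are assumptions made for the example. Each pixel is encoded by which neighbours are darker than it, and two pixels are compared by the Hamming distance between their codes, which makes the cost insensitive to a uniform brightness offset between the two cameras.

```python
def census_transform(img, radius=1):
    """Census code per pixel: one bit per neighbour, 1 if neighbour < centre.

    img is a list of rows of intensities; border pixels are left as 0.
    """
    h, w = len(img), len(img[0])
    codes = [[0] * w for _ in range(h)]
    for y in range(radius, h - radius):
        for x in range(radius, w - radius):
            code = 0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    if dy == 0 and dx == 0:
                        continue  # the centre pixel is not compared with itself
                    bit = 1 if img[y + dy][x + dx] < img[y][x] else 0
                    code = (code << 1) | bit
            codes[y][x] = code
    return codes


def hamming(a, b):
    """Matching cost between two census codes: number of differing bits."""
    return bin(a ^ b).count("1")


left = [[10, 20, 30],
        [40, 50, 60],
        [70, 80, 90]]
# Same patch seen by a brighter camera: every intensity shifted by +25.
right = [[v + 25 for v in row] for row in left]

code_left = census_transform(left)[1][1]
code_right = census_transform(right)[1][1]
census_cost = hamming(code_left, code_right)
sad_cost = sum(abs(a - b) for ra, rb in zip(left, right) for a, b in zip(ra, rb))
print(code_left, code_right, census_cost, sad_cost)
```

In this toy case the Census cost is zero while SAD reports a large difference for the same patch. AD-Census, as the name suggests, combines an absolute-difference term with the Census term; that combination is not shown here.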
-
Cones BM SAD:
-
Cones BM Census AD:
-
Textured image taken with head cart narrow stereo, SAD:
-
Census AD
-
Grid detection:
- Several improvements have been tried to make the algorithm work on severely distorted images of a circular grid template. One of the bottlenecks is hierarchical clustering, which fails to detect distorted grids and is sensitive to the distance threshold. The OPTICS (Ordering Points To Identify the Clustering Structure) algorithm has been implemented and evaluated as an alternative, but it also fails on several templates. We will experiment with varying threshold parameters and post-filtering. [Ilya]
-
OpenCV:
- bug #737 closed (bad accuracy for algorithmic test on SIFT)
Vadim
Most of the week was spent on answering users’ questions about # 2 release and fixing various bugs, mostly minor.
- Dedicated branch https://code.ros.org/svn/opencv/branches/# 2 has been created that contains bug fixes and no new functionality (well, except for Bayer2Gray).
- 8 bugs in OpenCV # 2 have been fixed: ## 731, 734, 736, 740, 748, 749, 751, 755
- Some work on the Kinect project: bilinear algorithm for Bayer→RGB conversion has been optimized using SSE2 with ~25% speedup. Added dedicated function for direct Bayer→grayscale conversion, optimized it with SSE# Now the conversion takes ~0.6ms on 640×480 images, which is >4x faster than the original Bayer→RGB + RGB→grayscale conversion. Experimented a bit with noise filtering of the images from Kinect. The best results have been obtained with the bilateral filter, but the performance is quite low.
- Started studying the Hudson CI system; installed Tomcat & hudson on my laptop.
plans: put several users’ contributions into OpenCV; continue experiments with hudson.
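To illustrate why a direct Bayer→grayscale conversion can skip the intermediate RGB image, here is a rough sketch. It is not the OpenCV function described above: it assumes an RGGB cell layout, produces one gray value per 2x2 cell (half resolution) instead of interpolating at full resolution, and uses the common 0.299/0.587/0.114 luma weights. The function name is hypothetical.

```python
def bayer_rggb_to_gray_half(raw):
    """Collapse each 2x2 RGGB cell of a Bayer mosaic into one gray value.

    raw is a list of rows of sensor values with an even width and height.
    Assumed cell layout:  R G
                          G B
    """
    gray = []
    for y in range(0, len(raw) - 1, 2):
        row = []
        for x in range(0, len(raw[0]) - 1, 2):
            r = raw[y][x]
            g = (raw[y][x + 1] + raw[y + 1][x]) / 2.0  # average the two greens
            b = raw[y + 1][x + 1]
            row.append(0.299 * r + 0.587 * g + 0.114 * b)
        gray.append(row)
    return gray


mosaic = [[100, 50, 80, 80],
          [50, 20, 80, 80]]
gray = bayer_rggb_to_gray_half(mosaic)
print(gray)
```

Each output value is computed from the raw samples in a single pass, so no three-channel image is ever written or read back.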
Action Items
Gary
- Sample code for using kinect with opencv
- Talk with Radu about interfacing OpenCV and PCL
- Mainly point clouds back and forth
- Funding a developer
- OpenCV foundation, bring up with leadership meeting
Alexander
Vadim
Victor
- Sample code for using kinect with opencv
- Talk to Sergey about payment for GSoC
- Talk to Sergey about setting up the OpenCV foundation
Kurt
- Do we get the raw or processed Bayer image from kinect? If we had the whole raw image, we can perhaps do better.
Ethan
From last time
Gary
- Try out android install
- Try to copy in the cheatsheet in the docs
Alexander
Vadim
- Put in document stub for background subtraction
- Update coverage
- Update cheatsheet, add cout << cv::Mat stuff
- Post release, discuss panoramic API with Ethan
Victor
- Document use of PASCAL VOC more
Kurt
Ethan
- Post OpenCV# 2 release, discuss with Vadim about pano API
Agenda
- Gary is on vacation, just some reports:
- Mainly, the effort to do a complete package create, install and test cycle on the Hudson build system so that we can enable a continuous release process. That is:
- As soon as release OpenCV X.Y comes out, release OpenCV X.Y+1 will also come out; initially both are the same.
- As time goes on, we continue to add to OpenCV X.Y+1. It is always an installable package.
- In time, OpenCV X.Y+1 will become the official release and OpenCV X.Y+2 will be put out at the same time, initially the same as OpenCV X.Y+1, and so on.
- Then,
- Getting OpenCV to work well with Kinect/PrimeSense
- Continuing refinement of Textured Object Detection (TOD)
- Work on a more robust calibration pattern
- And then some notes on the continuing evolution of the GPU version.
Minutes
Vadim
- The bug with video capturing on Windows has been fixed – thanks to the users who pointed out the bug location.
- Experiments with Hudson continued. I successfully ran Hudson on Linux and MacOSX. Cross-machine and virtual machine-based builds are not implemented yet, nor are “matrix builds”, where various combinations of the build settings are tested.
- For the best results Hudson wants the tests to report their results in a certain XML format; our test system cannot do that.
- The possible solutions could be
- a) add the proper XML backend to our test system or
- b) migrate to some other test system that supports such reports.
- While the first method could probably be implemented faster, I’m now trying the second method to see what it will take. At the same time, I plan to standardize the test methodology for OpenCV contributors. What has been done so far:
- Google Test has been extended with some OpenCV-specific functions and macros (for generating random matrices, comparing matrices etc.) – the license permits it. The modified engine has been put into opencv/modules/gtest.
- CMake scripts have been modified so that if there is a test subdirectory inside opencv/modules//, a dedicated test executable is built, e.g. from opencv/modules/core/test we get the “opencv_gtest_core” executable.
- Experimental tests for several OpenCV arithmetic functions (such as add, subtract, addWeighted, absdiff), have been written.
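As an illustration of option (a), an XML backend for an existing test system, here is a minimal sketch. It assumes that the format in question is the JUnit-style report (a `testsuite` element containing `testcase` elements), which the notes do not state explicitly. The function name, the suite name and the sample results are invented for the example.

```python
import xml.etree.ElementTree as ET


def build_junit_report(suite_name, results):
    """Build a JUnit-style XML report.

    results is a list of (test_name, seconds, failure_message_or_None) tuples.
    """
    failures = sum(1 for _, _, message in results if message is not None)
    suite = ET.Element("testsuite", name=suite_name,
                       tests=str(len(results)), failures=str(failures))
    for name, seconds, message in results:
        case = ET.SubElement(suite, "testcase", classname=suite_name,
                             name=name, time="%.3f" % seconds)
        if message is not None:
            # A failed test carries a <failure> child with the reason.
            ET.SubElement(case, "failure", message=message)
    return ET.tostring(suite, encoding="unicode")


# Hypothetical results for a few arithmetic tests.
report = build_junit_report("core_arithm", [
    ("add", 0.012, None),
    ("subtract", 0.010, None),
    ("absdiff", 0.015, "max difference exceeds tolerance"),
])
print(report)
```

Google Test can already write a report of this kind via its `--gtest_output=xml` option, which is one practical argument for option (b).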
I think that opencv_core tests could be transformed to the Google Test system rather quickly. In parallel, some tools and macros could be developed that will simplify migration of the other tests to the new test system. As a result we will have all the OpenCV tests decentralized and, after reorganization of the documentation build scripts, each module will become more or less independent, and some significant user contributions could be packaged as OpenCV modules.
Victor
-
OpenCV integration with Kinect:
- The ROS ni package works correctly; however, a sample based on the openni library fails with some errors that are hard to diagnose. The current solution is to take the ROS ni package and gradually convert it to opencv highgui. [Maria]
-
TOD: (Textured Object Detector)
- Most of the refactoring is done:
- Loader (creating a TrainingSet instance out of a training base in a new format),
- Matcher,
- ClusterBuilder classes have been implemented.
- findPoseGuesses function has been implemented within the new architecture.
- TOD screenshots are attached (matching using the Matcher class with FlannBasedDescriptorMatcher, object’s cloud projection using a transformation found with the GuessesFinder class). [Alexander]
-
Grid detector:
- Improved robustness of circles’ grid pattern detection by removing several parameters (e.g. maximum distance between an edge and a basis vector). The results are summarized in the table below:
narrow_left wide_left r_forearm_cam
Prev. version 0.85 0.82 0.60
Current version 0.91 0.83 0.71
- There are two main sources of misdetections:
- 1) Strong distortions and
- 2) Blob detector failures due to
- a) very bright illuminations or reflections of some circles;
- b) circles are very small when pattern is too far away from a camera.
- Integrated circles’ grid pattern detection in OpenCV trunk (calib3d module):
- committed the code,
- created documentation,
- modified achesscorners.cpp to test chessboard detection and circles’ grid detection,
- modified calibration.cpp sample to support both types of patterns.
- The final step for integration is to move several functions from one module to another in order to remove the dependency of features2d on calib3d. [Ilya]
Anatoly Baksheev (GPU)
1) SURF.
· Optimized the SURF descriptor calculation part (+50% speed), but the overall speed-up is only slight (~10%). Integrated the SURF code into the GPU module.
· Victor suggested that we replace the SURF + BFM parts in the ROS visual odometry package with our GPU implementation and compare performance. This work is in progress; I am now learning the packages.
2) FD and NPP_staging.
· Integrated NPP_staging into the build system and tested it.
· First adaptation of the face detection code. Got compile errors when trying to build opencv-gpu with the FD code. I think NPP_staging needs to be rebuilt with different compile flags. Will continue work on this.
3) matchTemplate.
· Implemented normalized methods; added support for multichannel images to gpu::matchTemplate.
· Refactored and integrated it into opencv (was unable to do this before because of the dependency on NPP_staging).
4) Other.
· Finished experiments with stereo calibration. The conclusion was sent to Victor. I don’t know if he forwarded it to Joe.
· Studied the paper about connected components on GPU that I mentioned earlier. It can’t help us implement cv::findContours.
Plans:
· Continue work with Anton’s FD code.
· Write getting started and FAQ. Begin GPU module documentation. This will attract many users.
· Experiments with SURF + BFM + ROS.
Action Items
Gary
Alexander
Vadim
Victor
Kurt
Ethan
From last time
Gary
- Sample code for using kinect with opencv
- Talk with Radu about interfacing OpenCV and PCL
- Mainly point clouds back and forth
- Funding a developer
- OpenCV foundation, bring up with leadership meeting
Alexander
Vadim
Victor
- Sample code for using kinect with opencv
- Talk to Sergey about payment for GSoC
- Talk to Sergey about setting up the OpenCV foundation
Kurt
- Do we get the raw or processed Bayer image from kinect? If we had the whole raw image, we can perhaps do better.
Ethan
© Copyright 2024, OpenCV team