Monday, January 27, 2020

PKU contest on Kaggle, 105th place

I took part in the PKU autonomous driving competition on Kaggle.
The task was to predict the 6-DoF pose of cars using only an RGB image
(3600x2700 resolution) and a 3D car model. At first I followed paper [1].
I tried the 6dvnet code and also ApolloScape_InstanceSeg from the author's
repository. The 6dvnet code contained bugs and did not give good 6-DoF
predictions (only the boxes and masks were accurate). ApolloScape_InstanceSeg
contained interesting code but did not give good results either.
I also tried [2] and [3], without much success.
I got adequate results from [4], using an hourglass model from code on GitHub.
Training took about 4 days on a GTX 1070 using resized images (1280x720).
I improved the score considerably by post-processing the regressed 6-DoF pose:
a grid search for a better translation that makes the car fit nicely into the
bbox. The main problems were:
a) yaw estimation error (it was frequently off by 180 degrees, because it
did not distinguish well between the front and the back of a car)
b) it did not detect all cars (Mask R-CNN or Faster R-CNN detected bboxes
much better).
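
The translation grid search could look roughly like the sketch below. This is
not the code I used in the contest, only a minimal illustration. It assumes a
pinhole camera, keeps the regressed rotation fixed, and scores each candidate
translation by the IoU between the projected model's 2D box and the detected
bbox. The IoU criterion, the function names, the search radius and the toy
intrinsics are all assumptions made for the example.

```python
import itertools
import numpy as np

def project_bbox(points, R, t, K):
    """Project 3D model points into the image; return (x1, y1, x2, y2) or None."""
    cam = points @ R.T + t                 # model frame -> camera frame
    if np.any(cam[:, 2] <= 0):             # some point is behind the camera
        return None
    uv = cam @ K.T                         # pinhole projection
    uv = uv[:, :2] / uv[:, 2:3]            # divide by depth
    return np.array([uv[:, 0].min(), uv[:, 1].min(),
                     uv[:, 0].max(), uv[:, 1].max()])

def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def refine_translation(points, R, t0, K, target_box, radius=1.0, steps=5):
    """Try a 3D grid of offsets around t0 and keep the best-fitting translation."""
    t_start = np.asarray(t0, dtype=float)
    offsets = np.linspace(-radius, radius, steps)
    box0 = project_bbox(points, R, t_start, K)
    best_t = t_start
    best_score = iou(box0, target_box) if box0 is not None else 0.0
    for d in itertools.product(offsets, repeat=3):
        t = t_start + np.array(d)
        box = project_bbox(points, R, t, K)
        if box is None:
            continue
        score = iou(box, target_box)
        if score > best_score:
            best_t, best_score = t, score
    return best_t, best_score

# Toy example: a box-shaped "car" and made-up intrinsics (not the contest's).
half = np.array([1.0, 0.75, 2.25])
corners = np.array(list(itertools.product([-1, 1], repeat=3))) * half
K = np.array([[1000.0, 0.0, 640.0], [0.0, 1000.0, 360.0], [0.0, 0.0, 1.0]])
R = np.eye(3)
t_true = np.array([2.0, 1.0, 20.0])
target = project_bbox(corners, R, t_true, K)   # stands in for a detected bbox
t_init = t_true + np.array([0.5, -0.5, 1.0])   # stands in for a regressed pose
t_best, score = refine_translation(corners, R, t_init, K, target)
```

With a real car the points would be the mesh vertices of the 3D model, and the
target box would come from the detector. A finer grid or a second pass around
the best candidate trades run time for accuracy.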

I did not have much time left at the end to solve these problems, so I only
made some post-processing fixes for problem a).
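
One generic way to handle a 180-degree yaw ambiguity is sketched below, purely
as an illustration and not as the fix I actually applied. It evaluates both the
predicted yaw and the yaw rotated by pi with some external cue and keeps the
better one. `score_fn` is a hypothetical callback (for example, agreement with
a mask or with the lane direction) that returns a higher value for a more
plausible yaw.

```python
import math

def wrap_angle(a):
    """Wrap an angle in radians to the range [-pi, pi)."""
    return (a + math.pi) % (2 * math.pi) - math.pi

def resolve_yaw_flip(yaw, score_fn):
    """Return either yaw or yaw + pi, whichever score_fn rates higher."""
    flipped = wrap_angle(yaw + math.pi)
    return max((yaw, flipped), key=score_fn)

# Toy cue: pretend an external signal says the car heads at about 0.1 rad.
prefer_forward = lambda y: math.cos(y - 0.1)
wrong_yaw = wrap_angle(0.1 + math.pi)          # prediction off by 180 degrees
fixed_yaw = resolve_yaw_flip(wrong_yaw, prefer_forward)
```

The quality of such a fix depends entirely on the cue behind `score_fn`. A
plain bbox fit is not enough, because a car rotated by 180 degrees projects to
almost the same box.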

[1] 6D-VNet: End-To-End 6-DoF Vehicle Pose Estimation From Monocular RGB Images
[2] 3D Bounding Box Estimation Using Deep Learning and Geometry
[3] Orthographic Feature Transform for Monocular 3D Object Detection
[4] Objects as Points, Zhou
