






Fast R-CNN architecture:

1) Convolution (96 filters, stride 4, kernel 11, pad 5),

2) ReLU,

3) Pooling (stride 2, kernel 3, pad 1),

4) LRN (size 4, alpha 0.001, beta 0.75),

5) Convolution (256 filters, stride 1, kernel 5, pad 2),

6) ReLU,

7) Pooling (stride 2, kernel 3, pad 1),

8) LRN (size 4, alpha 0.001, beta 0.75),

9) Convolution (384 filters, stride 1, kernel 3, pad 1),

10) ReLU,

11) Convolution (384 filters, stride 1, kernel 3, pad 1),

12) ReLU,

13) Convolution (256 filters, stride 1, kernel 3, pad 1),

14) ReLU,

15) RoI Data Layer (width = height = 6, spatial scale = 1/16),

16) Inner Product (4096 outputs),

17) ReLU,

18) Dropout (probability = 0.5),

19) Inner Product (FC8, 21 outputs),

20) ReLU,

21) Inner Product (FC9, 84 outputs),

22) Location Loss,

23) Confidence Loss.
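The spatial scale of 1/16 in the RoI layer can be related to the strides of the preceding layers. The following is a minimal sketch, assuming the parameter tuples in the list read as (filters, stride, kernel, pad) for convolutions and (stride, kernel, pad) for pooling; the layer names (conv1, pool1, and so on) are hypothetical labels introduced here for illustration.

```python
# Strides of the convolution and pooling layers, in the order of the list above.
# Assumption: the second value of each Convolution tuple and the first value of
# each Pooling tuple is the stride. Layer names are illustrative only.
LAYER_STRIDES = [
    ("conv1", 4),
    ("pool1", 2),
    ("conv2", 1),
    ("pool2", 2),
    ("conv3", 1),
    ("conv4", 1),
    ("conv5", 1),
]


def cumulative_stride(layers):
    """Return the product of all layer strides (input pixels per feature-map cell)."""
    total = 1
    for _name, stride in layers:
        total *= stride
    return total


total_stride = cumulative_stride(LAYER_STRIDES)
spatial_scale = 1 / total_stride
print(f"cumulative stride: {total_stride}, spatial scale: 1/{total_stride}")
```

Under this reading, one cell of the last convolutional feature map corresponds to a 16-pixel step in the input image, which is consistent with the 1/16 factor used to map region coordinates onto the feature map.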

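For ImageNet 2014 the architecture is the same, except for layers FC8 and FC9:

FC8 has 201 outputs instead of 21 (number of classes + background),

FC9 has 804 outputs instead of 84 (4 coordinates per class).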
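The FC8 and FC9 sizes (21 and 84, or 201 and 804) follow from the number of object classes. A short sketch, assuming one extra output for the background and four box coordinates per output class; the function name `head_sizes` is hypothetical:

```python
def head_sizes(num_object_classes):
    """Return (FC8 outputs, FC9 outputs) for a given number of object classes.

    Assumption: FC8 has one output per class plus one for the background,
    and FC9 has 4 box coordinates for each FC8 output.
    """
    num_outputs = num_object_classes + 1  # classes + background
    return num_outputs, 4 * num_outputs


print(head_sizes(20))   # PASCAL VOC: 20 object classes
print(head_sizes(200))  # ImageNet 2014 detection: 200 object classes
```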


→ min



Training is performed on mini-batches (mini-batch), with a batch size of 2.

PASCAL VOC 2012 ~ 2.5 .

The resulting mAP (mean average precision) is ~68.4%.


Table 3. mAP on PASCAL VOC 2012

             aero  bike  bird  boat  bottle  bus   car   cat   chair  cow  table
Fast R-CNN   82.3  78.4  70.8  52.3  38.7    77.8  71.6  89.3  44.2   -    -

Table 4. mAP on PASCAL VOC 2012

             dog   horse  mbike  person  plant  sheep  sofa  train  tv
Fast R-CNN   87.5  80.5   80.8   -       35.1   68.3   65.7  80.4   64.2
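mAP is the arithmetic mean of the per-class average precision values. As a minimal sketch, the snippet below averages the per-class values listed in Tables 3 and 4. The cells for cow, table, and person are empty in the tables, so this is the mean over the 17 listed classes only and need not coincide with the reported ~68.4%.

```python
# Per-class AP values (%) as listed in Tables 3 and 4.
# The cow, table and person cells are empty in the tables and are left out.
AP_BY_CLASS = {
    "aero": 82.3, "bike": 78.4, "bird": 70.8, "boat": 52.3, "bottle": 38.7,
    "bus": 77.8, "car": 71.6, "cat": 89.3, "chair": 44.2,
    "dog": 87.5, "horse": 80.5, "mbike": 80.8, "plant": 35.1,
    "sheep": 68.3, "sofa": 65.7, "train": 80.4, "tv": 64.2,
}


def mean_average_precision(ap_by_class):
    """Return the unweighted mean of the per-class AP values."""
    if not ap_by_class:
        raise ValueError("no per-class AP values given")
    return sum(ap_by_class.values()) / len(ap_by_class)


partial_map = mean_average_precision(AP_BY_CLASS)
print(f"mean over {len(AP_BY_CLASS)} listed classes: {partial_map:.1f}%")
```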


Compared with Multi-Box, mAP increased by ~19 percentage points (Multi-Box mAP is ~49.5%).
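The ~19% figure is the absolute difference between the two mAP values quoted in the text, not a relative improvement. A small sketch of the arithmetic, using only the numbers given above (variable names are illustrative):

```python
# mAP values on PASCAL VOC 2012 as quoted in the text (percent).
fast_rcnn_map = 68.4
multibox_map = 49.5

# Absolute gain in percentage points versus relative gain in percent.
absolute_gain = fast_rcnn_map - multibox_map
relative_gain = absolute_gain / multibox_map * 100

print(f"absolute gain: {absolute_gain:.1f} percentage points")
print(f"relative gain: {relative_gain:.1f}%")
```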


Fig. 13. PASCAL VOC 2012

On ImageNet 2014:

Fig. 14. ImageNet 2014

, Google:





Fast R-CNN ImageNet 2014. 1 . Multi-Box (mAP 0.43 ImageNet 2014), Fast R-CNN (mAP 0.56 ImageNet 2014).







Date added: 2016-09-06


