Peer-Reviewed

Fast Image Embedded Chinese Text Extracting by Homogeneous Space Mapping

Received: 14 May 2017     Published: 16 May 2017
Abstract

Text-embedded images are widely used on the mobile Internet to spread malicious information. This paper proposes a fast algorithm, based on homogeneous space mapping, for extracting Chinese text from such images. Image enhancement functions first highlight the edge and texture features of an image. The Sobel operator then extracts the edge feature, and a wavelet packet transform extracts 24-dimensional texture feature vectors from the enhanced image. Together, the texture and edge features describe the homogeneity of the image and are used to construct its homogeneous feature map. Because text and non-text regions differ in homogeneity, this difference is used to distinguish them and to further reduce the non-text regions, so that the text regions are highlighted. Homogeneous text samples are then used to train the text region detector, which greatly reduces the computational complexity. Finally, the characters are segmented and recognized. Experiments were conducted to verify the validity and practicability of the proposed algorithm. The recognition rate reaches 86%, which the authors report is higher than that of other methods used in industry. The algorithm has also been verified on an operator's malicious information monitoring system, where it provides reliable filtering of malicious information.
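
The abstract does not give the homogeneity formula, so the following is only a minimal sketch of how an edge feature and a texture feature could be combined into a homogeneity map. It rests on several assumptions that are not taken from the paper: the combination rule (homogeneity = 1 - normalized edge × normalized texture), the 3×3 window, and the use of local standard deviation as a simple stand-in for the 24-dimensional wavelet packet texture features. All function names are hypothetical.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = ((-1, 0, 1), (-2, 0, 2), (-1, 0, 1))
SOBEL_Y = ((-1, -2, -1), (0, 0, 0), (1, 2, 1))


def window(img, r, c):
    """Return the 3x3 neighbourhood of (r, c), replicating border pixels."""
    h, w = len(img), len(img[0])
    return [[img[min(max(r + dr, 0), h - 1)][min(max(c + dc, 0), w - 1)]
             for dc in (-1, 0, 1)] for dr in (-1, 0, 1)]


def sobel_magnitude(img):
    """Gradient magnitude of a grayscale image given as a list of rows."""
    out = []
    for r in range(len(img)):
        row = []
        for c in range(len(img[0])):
            win = window(img, r, c)
            gx = sum(SOBEL_X[i][j] * win[i][j] for i in range(3) for j in range(3))
            gy = sum(SOBEL_Y[i][j] * win[i][j] for i in range(3) for j in range(3))
            row.append(math.hypot(gx, gy))
        out.append(row)
    return out


def local_std(img):
    """Standard deviation in each 3x3 window (assumed texture stand-in)."""
    out = []
    for r in range(len(img)):
        row = []
        for c in range(len(img[0])):
            vals = [v for line in window(img, r, c) for v in line]
            mean = sum(vals) / 9.0
            row.append(math.sqrt(sum((v - mean) ** 2 for v in vals) / 9.0))
        out.append(row)
    return out


def normalize(feature):
    """Scale a feature map to [0, 1]; an all-zero map stays all zero."""
    peak = max(max(row) for row in feature)
    if peak == 0:
        return [[0.0] * len(row) for row in feature]
    return [[v / peak for v in row] for row in feature]


def homogeneity_map(img):
    """Assumed rule: homogeneity = 1 - normalized edge * normalized texture."""
    edge = normalize(sobel_magnitude(img))
    tex = normalize(local_std(img))
    return [[1.0 - e * t for e, t in zip(er, tr)] for er, tr in zip(edge, tex)]


# Usage: a flat image versus an image with one sharp vertical edge.
flat = [[10] * 6 for _ in range(6)]
step = [[0, 0, 0, 255, 255, 255] for _ in range(6)]
print(homogeneity_map(flat)[0])
print(homogeneity_map(step)[0])
```

Under this assumed rule, pixels with both strong edges and busy local texture receive low homogeneity, while smooth background stays near 1. A real implementation would replace `local_std` with the wavelet packet features described in the abstract and would follow the enhancement step that precedes feature extraction.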

Published in Journal of Electrical and Electronic Engineering (Volume 5, Issue 3)
DOI 10.11648/j.jeee.20170503.11
Page(s) 86-91
Creative Commons

This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.

Copyright

Copyright © The Author(s), 2017. Published by Science Publishing Group

Keywords

Chinese Text Embedded Image, Homogeneous Mapping, Text Extraction, Information Security

Cite This Article
  • APA Style

    Li Ying, Liu Lisha, Cui Yan-peng, Zhuang Huaiyu. (2017). Fast Image Embedded Chinese Text Extracting by Homogeneous Space Mapping. Journal of Electrical and Electronic Engineering, 5(3), 86-91. https://doi.org/10.11648/j.jeee.20170503.11

    ACS Style

    Li Ying; Liu Lisha; Cui Yan-peng; Zhuang Huaiyu. Fast Image Embedded Chinese Text Extracting by Homogeneous Space Mapping. J. Electr. Electron. Eng. 2017, 5(3), 86-91. doi: 10.11648/j.jeee.20170503.11

    AMA Style

    Li Ying, Liu Lisha, Cui Yan-peng, Zhuang Huaiyu. Fast Image Embedded Chinese Text Extracting by Homogeneous Space Mapping. J Electr Electron Eng. 2017;5(3):86-91. doi: 10.11648/j.jeee.20170503.11

  • @article{10.11648/j.jeee.20170503.11,
      author = {Li Ying and Liu Lisha and Cui Yan-peng and Zhuang Huaiyu},
      title = {Fast Image Embedded Chinese Text Extracting by Homogeneous Space Mapping},
      journal = {Journal of Electrical and Electronic Engineering},
      volume = {5},
      number = {3},
      pages = {86-91},
      doi = {10.11648/j.jeee.20170503.11},
      url = {https://doi.org/10.11648/j.jeee.20170503.11},
      eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.jeee.20170503.11},
     year = {2017}
    }
    

  • TY  - JOUR
    T1  - Fast Image Embedded Chinese Text Extracting by Homogeneous Space Mapping
    AU  - Li Ying
    AU  - Liu Lisha
    AU  - Cui Yan-peng
    AU  - Zhuang Huaiyu
    Y1  - 2017/05/16
    PY  - 2017
    N1  - https://doi.org/10.11648/j.jeee.20170503.11
    DO  - 10.11648/j.jeee.20170503.11
    T2  - Journal of Electrical and Electronic Engineering
    JF  - Journal of Electrical and Electronic Engineering
    JO  - Journal of Electrical and Electronic Engineering
    SP  - 86
    EP  - 91
    PB  - Science Publishing Group
    SN  - 2329-1605
    UR  - https://doi.org/10.11648/j.jeee.20170503.11
    VL  - 5
    IS  - 3
    ER  - 

Author Information
  • School of Electronic Engineering, Xidian University, Xi'an, China

  • School of Electronic Engineering, Xidian University, Xi'an, China

  • Institute for Internet Behavior, Xidian University, Xi'an, China

  • China Mobile Group Guangdong Co., Ltd., Guangzhou, China
