Welcome to the CoinNUMS database

Pan X and Tougne L (2017). A New Database of Digits Extracted from Coins with Hard-to-Segment Foreground for OCR Evaluation. Front. ICT 4:9. doi: 10.3389/fict.2017.00009




Character recognition with conventional OCR-based methods has achieved great success in applications such as reading
scanned documents or license plates. However, characters such as those extracted from coins are a special case that has
received little attention from researchers. The main feature of characters found on coins is that the foreground and the
background have the same properties in terms of color or texture. Under the doctoral research project ANRT 0807/2013, we
constructed the CoinNUMS database, which currently contains 3006 digit images cropped from professional coin photos. We
encourage other researchers to propose algorithms that deal with such special characters and to report their results. The
database contains three subsets: CoinNUMS_geni, CoinNUMS_pcgs_a, and CoinNUMS_pcgs_m.


CoinNUMS_geni, extracted from professional coin photos of GENI:
  • Total number: 606
  • Classes: 10 (from 0 to 9)
  • Label: Class_Index
  • Source: GENI professional coin photos
  • Remarks: low noise level (mostly cropped from well-preserved coins, manual cropping)



CoinNUMS_pcgs_a, extracted from professional coin photos of PCGS:
  • Total number: 1200
  • Classes: 10 (from 0 to 9)
  • Label: Class_PositionInDate_CoinName
  • Source: USA_Grading (PCGS professional coin photos)
  • Remarks: high noise level (some cropped from degraded coins, automatic cropping with errors)



CoinNUMS_pcgs_m, extracted from professional coin photos of PCGS:
  • Total number: 1200
  • Classes: 10 (from 0 to 9)
  • Label: Class_PositionInDate_CoinName
  • Source: USA_Grading (PCGS professional coin photos)
  • Remarks: high noise level (some cropped from degraded coins, manual cropping)



* For digit recognition, only the first character of the label (the ground-truth class) is needed.
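
The label formats listed above can be split into their fields with a few lines of Python. This is only a minimal sketch: it assumes that the label is the image file name without its extension, that fields are separated by underscores, and that a coin name may itself contain underscores. The names DigitLabel and parse_label are hypothetical and are not part of the database.

```python
from pathlib import Path
from typing import NamedTuple, Optional


class DigitLabel(NamedTuple):
    digit: int                  # ground-truth class, 0 to 9
    position: Optional[str]     # PositionInDate (pcgs subsets only)
    coin_name: Optional[str]    # CoinName (pcgs subsets only)
    index: Optional[str]        # Index (geni subset only)


def parse_label(filename: str, subset: str) -> DigitLabel:
    """Split a label into fields (assumption: label == file name stem)."""
    stem = Path(filename).stem      # drop the extension, e.g. '.jpg'
    digit = int(stem[0])            # first character is the class
    if subset == "CoinNUMS_geni":
        # Label format: Class_Index
        _, _, index = stem.partition("_")
        return DigitLabel(digit, None, None, index or None)
    # Label format: Class_PositionInDate_CoinName. Split at most twice
    # so that underscores inside the coin name are preserved.
    parts = stem.split("_", 2)
    position = parts[1] if len(parts) > 1 else None
    coin_name = parts[2] if len(parts) > 2 else None
    return DigitLabel(digit, position, coin_name, None)
```

For example, with a made-up file name, parse_label("7_3_Some_Coin.jpg", "CoinNUMS_pcgs_a").digit gives the class 7, which is the only field a digit classifier needs.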

Download

You may freely download this data for research purposes by submitting your contact information
(email, research institute, etc.). For convenient downloading, zipped versions of all the data are provided in each directory.
The images are stored as RGB JPG files.
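
After unzipping a subset, a quick sanity check is to tally the images per digit class and compare the total with the subset size given above (606 for CoinNUMS_geni, 1200 for each PCGS subset). The sketch below makes several assumptions that the database description does not confirm: each subset unpacks into a single flat directory, the files carry a lowercase .jpg extension, and each file name begins with the class digit. The function name count_digits is hypothetical.

```python
from collections import Counter
from pathlib import Path


def count_digits(subset_dir: str) -> Counter:
    """Count images per digit class in one unpacked subset directory."""
    counts: Counter = Counter()
    for path in sorted(Path(subset_dir).glob("*.jpg")):
        first = path.name[0]            # assumed to be the class digit
        if first in "0123456789":       # skip files without a digit label
            counts[int(first)] += 1
    return counts
```

Calling sum(count_digits("CoinNUMS_geni").values()) on a complete download should then return 606, provided the assumptions about the directory layout hold.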