python-stdnum branch master updated. 1.13-29-g0d5b8b1

This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "python-stdnum".

The branch, master has been updated
       via  0d5b8b154945cc3cdcea5ddfd0676332b4cf5ad4 (commit)
      from  4eda3f3535d28e2486745f33504c417ba6837c3a (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
https://arthurdejong.org/git/python-stdnum/commit/?id=0d5b8b154945cc3cdcea5ddfd0676332b4cf5ad4

commit 0d5b8b154945cc3cdcea5ddfd0676332b4cf5ad4
Author: Leandro Regueiro <leandro.regueiro@gmail.com>
Date:   Wed Mar 18 20:45:35 2020 +0100

    Add support for Singapore Unique Entity Number
    
    Closes https://github.com/arthurdejong/python-stdnum/issues/111
    Closes https://github.com/arthurdejong/python-stdnum/pull/203

diff --git a/stdnum/sg/__init__.py b/stdnum/sg/__init__.py
new file mode 100644
index 0000000..1d3e79d
--- /dev/null
+++ b/stdnum/sg/__init__.py
@@ -0,0 +1,24 @@
+# __init__.py - collection of Singapore numbers
+# coding: utf-8
+#
+# Copyright (C) 2020 Leandro Regueiro
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2.1 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+# 02110-1301 USA
+
+"""Collection of Singapore numbers."""
+
+# provide aliases
+from stdnum.sg import uen as vat  # noqa: F401
diff --git a/stdnum/sg/uen.py b/stdnum/sg/uen.py
new file mode 100644
index 0000000..2b49be1
--- /dev/null
+++ b/stdnum/sg/uen.py
@@ -0,0 +1,172 @@
+# uen.py - functions for handling Singapore UEN numbers
+# coding: utf-8
+#
+# Copyright (C) 2020 Leandro Regueiro
+# Copyright (C) 2020 Arthur de Jong
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2.1 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+# 02110-1301 USA
+
+"""UEN (Singapore's Unique Entity Number).
+
+The Unique Entity Number (UEN) is a 9 or 10 character identifier issued by
+the government of Singapore to businesses that operate within Singapore.
+
+
+Accounting and Corporate Regulatory Authority (ACRA)
+
+There are three different formats:
+
+* Business (ROB): It consists of 8 digits followed by a check letter.
+* Local Company (ROC): It consists of 9 digits (the 4 leftmost digits
+  represent the year of issuance) followed by a check letter.
+* Others: Consists of 10 characters, begins with either the R letter, or the
+  S letter or the T letter followed by 2 digits representing the last two
+  digits of the issuance year, followed by two letters representing the
+  entity type, 4 digits and finally a check letter.
+
+More information:
+
+* https://www.oecd.org/tax/automatic-exchange/crs-implementation-and-assistance/tax-identification-numbers/Singapore-TIN.pdf
+* https://www.uen.gov.sg/ueninternet/faces/pages/admin/aboutUEN.jspx
+
+>>> validate('00192200M')
+'00192200M'
+>>> validate('197401143C')
+'197401143C'
+>>> validate('S16FC0121D')
+'S16FC0121D'
+>>> validate('T01FC6132D')
+'T01FC6132D'
+>>> validate('123456')
+Traceback (most recent call last):
+    ...
+InvalidLength: ...
+"""
+
+# There are some references to special 10-digit (or 7-digit) numbers that
+# start with an F for foreign companies but it is unclear whether this is
+# still current and not even examples of these numbers could be found.
+
+from datetime import datetime
+
+from stdnum.exceptions import *
+from stdnum.util import clean, isdigits
+
+
+OTHER_UEN_ENTITY_TYPES = (
+    'CC', 'CD', 'CH', 'CL', 'CM', 'CP', 'CS', 'CX', 'DP', 'FB', 'FC', 'FM',
+    'FN', 'GA', 'GB', 'GS', 'HS', 'LL', 'LP', 'MB', 'MC', 'MD', 'MH', 'MM',
+    'MQ', 'NB', 'NR', 'PA', 'PB', 'PF', 'RF', 'RP', 'SM', 'SS', 'TC', 'TU',
+    'VH', 'XL',
+)
+
+
+def compact(number):
+    """Convert the number to the minimal representation.
+
+    This converts the number to uppercase and removes surrounding
+    whitespace. Whitespace inside the number is left untouched, so such
+    numbers are rejected by validate().
+    """
+    return clean(number).upper().strip()
+
+
+def calc_business_check_digit(number):
+    """Calculate the check digit for the Business (ROB) number."""
+    number = compact(number)
+    weights = (10, 4, 9, 3, 8, 2, 7, 1)
+    return 'XMKECAWLJDB'[sum(int(n) * w for n, w in zip(number, weights)) % 11]
+
+
+def _validate_business(number):
+    """Perform validation on UEN - Business (ROB) numbers."""
+    if not isdigits(number[:-1]):
+        raise InvalidFormat()
+    if not number[-1].isalpha():
+        raise InvalidFormat()
+    if number[-1] != calc_business_check_digit(number):
+        raise InvalidChecksum()
+    return number
+
+
+def calc_local_company_check_digit(number):
+    """Calculate the check digit for the Local Company (ROC) number."""
+    number = compact(number)
+    weights = (10, 8, 6, 4, 9, 7, 5, 3, 1)
+    return 'ZKCMDNERGWH'[sum(int(n) * w for n, w in zip(number, weights)) % 11]
+
+
+def _validate_local_company(number):
+    """Perform validation on UEN - Local Company (ROC) numbers."""
+    if not isdigits(number[:-1]):
+        raise InvalidFormat()
+    current_year = str(datetime.now().year)
+    if number[:4] > current_year:
+        raise InvalidComponent()
+    if number[-1] != calc_local_company_check_digit(number):
+        raise InvalidChecksum()
+    return number
+
+
+def calc_other_check_digit(number):
+    """Calculate the check digit for other entity numbers."""
+    number = compact(number)
+    alphabet = 'ABCDEFGHJKLMNPQRSTUVWX0123456789'
+    weights = (4, 3, 5, 3, 10, 2, 2, 5, 7)
+    return alphabet[(sum(alphabet.index(n) * w for n, w in zip(number, weights)) - 5) % 11]
+
+
+def _validate_other(number):
+    """Perform validation on other UEN numbers."""
+    if number[0] not in ('R', 'S', 'T'):
+        raise InvalidComponent()
+    if not isdigits(number[1:3]):
+        raise InvalidFormat()
+    current_year = str(datetime.now().year)
+    if number[0] == 'T' and number[1:3] > current_year[2:]:
+        raise InvalidComponent()
+    if number[3:5] not in OTHER_UEN_ENTITY_TYPES:
+        raise InvalidComponent()
+    if not isdigits(number[5:-1]):
+        raise InvalidFormat()
+    if number[-1] != calc_other_check_digit(number):
+        raise InvalidChecksum()
+    return number
+
+
+def validate(number):
+    """Check if the number is a valid Singapore UEN number."""
+    number = compact(number)
+    if len(number) not in (9, 10):
+        raise InvalidLength()
+    if len(number) == 9:
+        return _validate_business(number)
+    if isdigits(number[0]):
+        return _validate_local_company(number)
+    return _validate_other(number)
+
+
+def is_valid(number):
+    """Check if the number is a valid Singapore UEN number."""
+    try:
+        return bool(validate(number))
+    except ValidationError:
+        return False
+
+
+def format(number):
+    """Reformat the number to the standard presentation format."""
+    return compact(number)
diff --git a/tests/test_sg_uen.doctest b/tests/test_sg_uen.doctest
new file mode 100644
index 0000000..caa2d0a
--- /dev/null
+++ b/tests/test_sg_uen.doctest
@@ -0,0 +1,301 @@
+test_sg_uen.doctest - more detailed doctests for stdnum.sg.uen module
+
+Copyright (C) 2020 Leandro Regueiro
+
+This library is free software; you can redistribute it and/or
+modify it under the terms of the GNU Lesser General Public
+License as published by the Free Software Foundation; either
+version 2.1 of the License, or (at your option) any later version.
+
+This library is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+Lesser General Public License for more details.
+
+You should have received a copy of the GNU Lesser General Public
+License along with this library; if not, write to the Free Software
+Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+02110-1301 USA
+
+
+This file contains more detailed doctests for the stdnum.sg.uen module. It
+tries to test more corner cases and detailed functionality that is not really
+useful as module documentation.
+
+>>> from stdnum.sg import uen
+
+
+Tests for some corner cases.
+
+>>> uen.validate('00192200M')
+'00192200M'
+>>> uen.validate('00192200C')
+Traceback (most recent call last):
+    ...
+InvalidChecksum: ...
+>>> uen.validate('197401143C')
+'197401143C'
+>>> uen.validate('S16FC0121D')
+'S16FC0121D'
+>>> uen.validate('T01FC6132D')
+'T01FC6132D'
+>>> uen.format(' 00192200M ')
+'00192200M'
+>>> uen.validate('123456')
+Traceback (most recent call last):
+    ...
+InvalidLength: ...
+>>> uen.validate('R2345678H')
+Traceback (most recent call last):
+    ...
+InvalidFormat: ...
+>>> uen.validate('123456789')
+Traceback (most recent call last):
+    ...
+InvalidFormat: ...
+>>> uen.validate('1R3456789H')
+Traceback (most recent call last):
+    ...
+InvalidFormat: ...
+>>> uen.validate('999956789H')
+Traceback (most recent call last):
+    ...
+InvalidComponent: ...
+>>> uen.validate('1234567890')
+Traceback (most recent call last):
+    ...
+InvalidChecksum: ...
+>>> uen.validate('W23LL6789H')
+Traceback (most recent call last):
+    ...
+InvalidComponent: ...
+>>> uen.validate('S2WLL6789H')
+Traceback (most recent call last):
+    ...
+InvalidFormat: ...
+>>> uen.validate('T99LL6789H')
+Traceback (most recent call last):
+    ...
+InvalidComponent: ...
+>>> uen.validate('T02WW6789H')
+Traceback (most recent call last):
+    ...
+InvalidComponent: ...
+>>> uen.validate('T02LL6W89H')
+Traceback (most recent call last):
+    ...
+InvalidFormat: ...
+>>> uen.validate('T02LL67890')
+Traceback (most recent call last):
+    ...
+InvalidChecksum: ...
+
+
+These have been found online and should all be valid numbers.
+
+>>> numbers = '''
+...
+... 00416700K
+... 01402200A
+... 03122200E
+... 03989700A
+... 05040200E
+... 05328700B
+... 05346600D
+... 05682800D
+... 05796700L
+... 06108900A
+... 06273700K
+... 06540200J
+... 06905300J
+... 07059900M
+... 07088800X
+... 07330000D
+... 07392800C
+... 07609900L
+... 08301600K
+... 09402000X
+... 10080700C
+... 10185500A
+... 10362000A
+... 10415300L
+... 10482100X
+... 10528100W
+... 10655900A
+... 10839500J
+... 198101793G
+... 199201624D
+... 199409389H
+... 199607747H
+... 199903512M
+... 200001838R
+... 200003956G
+... 200311327H
+... 200402245Z
+... 200509725E
+... 200601141M
+... 200612239R
+... 200613692K
+... 200806526H
+... 200923096R
+... 201001206N
+... 201026348Z
+... 201107298M
+... 201118211H
+... 201221002E
+... 201225997K
+... 201227749M
+... 201312700G
+... 201316157E
+... 201405619W
+... 201421015W
+... 201422211Z
+... 201427857E
+... 201430557E
+... 201434292D
+... 201505714C
+... 201506999D
+... 201507276Z
+... 201509563K
+... 201524437R
+... 201528593H
+... 201530032R
+... 201533374C
+... 201538146W
+... 201539125W
+... 201539692G
+... 201541306Z
+... 201605323H
+... 201608874E
+... 201612228W
+... 201613871H
+... 201616811C
+... 201620388N
+... 201621244C
+... 201626104W
+... 201630570C
+... 201630906R
+... 201703509H
+... 201709179W
+... 201723655Z
+... 201728348W
+... 201729145M
+... 201730487Z
+... 201732074H
+... 201801863N
+... 201810763C
+... 201813990E
+... 201814325Z
+... 201819215N
+... 201828636K
+... 201831257M
+... 201831267E
+... 201902616N
+... 201927572K
+... 201931750M
+... 201933247R
+... 53143018M
+... 53321041X
+... 53322268X
+... 53325701L
+... 53327223L
+... 53328294B
+... 53329644K
+... 53329865B
+... 53333269M
+... 53333709K
+... 53334750E
+... 53336671B
+... 53337313A
+... 53337959L
+... 53338455M
+... 53338593L
+... 53339826J
+... 53340108M
+... 53340486W
+... 53341233A
+... 53343978K
+... 53344927L
+... 53346524M
+... 53346603B
+... 53346756W
+... 53351301D
+... 53354912D
+... 53357813B
+... 53358098J
+... 53358936C
+... 53359192M
+... 53359796W
+... 53360148K
+... 53360367J
+... 53360906B
+... 53365157C
+... 53366537W
+... 53366669D
+... 53368090E
+... 53369142K
+... 53371741L
+... 53372084B
+... 53372845B
+... 53372960B
+... 53373278J
+... 53373376J
+... 53374181E
+... 53374421X
+... 53377389W
+... 53377498E
+... 53377824K
+... 53380578C
+... 53382783C
+... 53383991B
+... 53384227A
+... 53393038C
+... 53393390X
+... 53393519M
+... 53394981E
+... 53395684J
+... 53395867M
+... 53397681B
+... 53398169K
+... 53399589B
+... 53399638D
+... 53399659K
+... S16FC0121D
+... S27FC0556D
+... S64FC1644H
+... S65SS0033F
+... S66SS0041B
+... S68FC1890G
+... S85FC3621C
+... S99FC5759D
+... T01FC6132D
+... T07LL0309A
+... T08LL0003B
+... T08LL0005E
+... T08LL0721A
+... T08LL0979J
+... T10LL1392L
+... T10LL1400J
+... T11LL0668C
+... T11LL1149C
+... T11LL1971L
+... T11LL2079G
+... T12LL0781D
+... T12LL2046J
+... T12LL2127H
+... T14FC0094H
+... T14LL1800D
+... T15LL1753J
+... T15LL1855C
+... T15LL1956A
+... T16LL0695L
+... T16LL1853E
+... T17LL0235L
+... T17LL1494G
+... T17LL2098G
+... T18FC0083E
+... T19LL0797J
+...
+... '''
+>>> [x for x in numbers.splitlines() if x and not uen.is_valid(x)]
+[]

-----------------------------------------------------------------------

Summary of changes:
 stdnum/{uy => sg}/__init__.py |   8 +-
 stdnum/sg/uen.py              | 172 ++++++++++++++++++++++++
 tests/test_sg_uen.doctest     | 301 ++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 477 insertions(+), 4 deletions(-)
 copy stdnum/{uy => sg}/__init__.py (81%)
 create mode 100644 stdnum/sg/uen.py
 create mode 100644 tests/test_sg_uen.doctest
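
The three check-letter calculations added in stdnum/sg/uen.py all follow the same pattern: a weighted sum of the leading characters, reduced modulo 11 and used as an index into a lookup string. The sketch below is a standalone illustration of that pattern, not the module itself. The weights and lookup strings are copied from the diff above; the function and constant names are hypothetical and do not exist in python-stdnum.

```python
# Illustrative re-implementation of the three UEN check-letter schemes.
# Weights and lookup strings are copied from stdnum/sg/uen.py in this commit;
# all names below are hypothetical and not part of python-stdnum.

ROB_WEIGHTS = (10, 4, 9, 3, 8, 2, 7, 1)
ROB_LETTERS = 'XMKECAWLJDB'

ROC_WEIGHTS = (10, 8, 6, 4, 9, 7, 5, 3, 1)
ROC_LETTERS = 'ZKCMDNERGWH'

OTHER_ALPHABET = 'ABCDEFGHJKLMNPQRSTUVWX0123456789'
OTHER_WEIGHTS = (4, 3, 5, 3, 10, 2, 2, 5, 7)


def weighted_sum(values, weights):
    """Sum of value * weight over paired positions."""
    return sum(v * w for v, w in zip(values, weights))


def rob_check_letter(digits):
    """Check letter for the 8 leading digits of a Business (ROB) UEN."""
    total = weighted_sum((int(d) for d in digits), ROB_WEIGHTS)
    return ROB_LETTERS[total % 11]


def roc_check_letter(digits):
    """Check letter for the 9 leading digits of a Local Company (ROC) UEN."""
    total = weighted_sum((int(d) for d in digits), ROC_WEIGHTS)
    return ROC_LETTERS[total % 11]


def other_check_letter(body):
    """Check letter for the 9 leading characters of an 'other' UEN."""
    # Each character is mapped to its position in OTHER_ALPHABET first.
    total = weighted_sum((OTHER_ALPHABET.index(c) for c in body), OTHER_WEIGHTS)
    return OTHER_ALPHABET[(total - 5) % 11]


print(rob_check_letter('00192200'))     # M  (weighted sum 56, 56 % 11 == 1)
print(roc_check_letter('197401143'))    # C  (weighted sum 167, 167 % 11 == 2)
print(other_check_letter('S16FC0121'))  # D  (weighted sum 679, (679 - 5) % 11 == 3)
```

The inputs are the example numbers from the module docstring with the final check letter removed, so each printed letter matches the last character of the corresponding UEN.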


hooks/post-receive
-- 
python-stdnum