python-stdnum branch master updated. 1.7-18-gbafdb70


This is an automated email from the git hooks/post-receive script. It was
generated because a ref change was pushed to the repository containing
the project "python-stdnum".

The branch, master has been updated
       via  bafdb70b54262633d9c4bb865ba8efbc4b1fdaa9 (commit)
      from  d5f97e98150ca9a0fe84d929f7900ce40f015891 (commit)

Those revisions listed above that are new to this repository have
not appeared on any other notification email; so we list those
revisions in full, below.

- Log -----------------------------------------------------------------
https://arthurdejong.org/git/python-stdnum/commit/?id=bafdb70b54262633d9c4bb865ba8efbc4b1fdaa9

commit bafdb70b54262633d9c4bb865ba8efbc4b1fdaa9
Author: Arthur de Jong <arthur@arthurdejong.org>
Date:   Sun Nov 26 21:51:48 2017 +0100

    Add CAS Registry Number
    
    This adds validation of the Chemical Abstracts Service Registry Number.

diff --git a/stdnum/casrn.py b/stdnum/casrn.py
new file mode 100644
index 0000000..de181b8
--- /dev/null
+++ b/stdnum/casrn.py
@@ -0,0 +1,76 @@
+# casrn.py - functions for handling CAS Registry Numbers
+#
+# Copyright (C) 2017 Arthur de Jong
+#
+# This library is free software; you can redistribute it and/or
+# modify it under the terms of the GNU Lesser General Public
+# License as published by the Free Software Foundation; either
+# version 2.1 of the License, or (at your option) any later version.
+#
+# This library is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+# Lesser General Public License for more details.
+#
+# You should have received a copy of the GNU Lesser General Public
+# License along with this library; if not, write to the Free Software
+# Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+# 02110-1301 USA
+
+"""CAS RN (Chemical Abstracts Service Registry Number).
+
+The CAS Registry Number is a unique identifier assigned by the Chemical
+Abstracts Service (CAS) to a chemical substance.
+
+More information:
+
+* https://en.wikipedia.org/wiki/CAS_Registry_Number
+
+>>> validate('87-86-5')
+'87-86-5'
+>>> validate('87-86-6')
+Traceback (most recent call last):
+    ...
+InvalidChecksum: ...
+"""
+
+from stdnum.exceptions import *
+from stdnum.util import clean
+
+
+def compact(number):
+    """Convert the number to the minimal representation."""
+    number = clean(number, ' ').strip()
+    if '-' not in number:
+        number = '-'.join((number[:-3], number[-3:-1], number[-1:]))
+    return number
+
+
+def calc_check_digit(number):
+    """Calculate the check digit for the number. The passed number should not
+    have the check digit included."""
+    number = number.replace('-', '')
+    return str(
+        sum((i + 1) * int(n) for i, n in enumerate(reversed(number))) % 10)
+
+
+def validate(number):
+    """Check if the number provided is a valid CAS RN."""
+    number = compact(number)
+    if not 7 <= len(number) <= 12:
+        raise InvalidLength()
+    if not number[:-5].isdigit() or not number[-4:-2].isdigit():
+        raise InvalidFormat()
+    if number[-2] != '-' or number[-5] != '-':
+        raise InvalidFormat()
+    if number[-1] != calc_check_digit(number[:-1]):
+        raise InvalidChecksum()
+    return number
+
+
+def is_valid(number):
+    """Check if the number provided is a valid CAS RN."""
+    try:
+        return bool(validate(number))
+    except ValidationError:
+        return False
diff --git a/tests/test_casrn.doctest b/tests/test_casrn.doctest
new file mode 100644
index 0000000..3ba6906
--- /dev/null
+++ b/tests/test_casrn.doctest
@@ -0,0 +1,105 @@
+test_casrn.doctest - more detailed doctests for the stdnum.casrn module
+
+Copyright (C) 2017 Arthur de Jong
+
+This library is free software; you can redistribute it and/or
+modify it under the terms of the GNU Lesser General Public
+License as published by the Free Software Foundation; either
+version 2.1 of the License, or (at your option) any later version.
+
+This library is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU
+Lesser General Public License for more details.
+
+You should have received a copy of the GNU Lesser General Public
+License along with this library; if not, write to the Free Software
+Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston, MA
+02110-1301 USA
+
+
+This file contains more detailed doctests for the stdnum.casrn module. It
+contains some corner case tests and tries to validate numbers that have been
+found online.
+
+>>> from stdnum import casrn
+>>> from stdnum.exceptions import *
+
+
+The number appears to always be written with separators, so we insert them
+if they are absent. Validation still fails if separators are present but
+misplaced or inconsistently placed.
+
+>>> casrn.validate('329-65-7')
+'329-65-7'
+>>> casrn.validate('329657')
+'329-65-7'
+>>> casrn.validate('32-96-57')
+Traceback (most recent call last):
+    ...
+InvalidFormat: ...
+>>> casrn.validate('32965-7')
+Traceback (most recent call last):
+    ...
+InvalidFormat: ...
+
+
+The first component of a CAS RN can be 2 to 7 digits long.
+
+>>> casrn.validate('51-43-4')
+'51-43-4'
+>>> casrn.validate('1-43-4')
+Traceback (most recent call last):
+    ...
+InvalidLength: ...
+>>> casrn.validate('2040295-03-0')
+'2040295-03-0'
+>>> casrn.validate('12040295-03-0')
+Traceback (most recent call last):
+    ...
+InvalidLength: ...
+
+
+These should all be valid CAS Registry Numbers.
+
+>>> numbers = '''
+...
+... 51-43-4
+... 87-86-5
+... 150-05-0
+... 329-65-7
+... 608-93-5
+... 1305-78-8
+... 1344-09-8
+... 1972-08-3
+... 2650-18-2
+... 3087-16-9
+... 3524-62-7
+... 6104-58-1
+... 7440-44-0
+... 7440-47-3
+... 7732-18-5
+... 7782-40-3
+... 7782-42-5
+... 8007-40-7
+... 9031-72-5
+... 9032-02-4
+... 9035-40-9
+... 12627-53-1
+... 14314-42-2
+... 16065-83-1
+... 18540-29-9
+... 49863-03-8
+... 55480-22-3
+... 56182-07-1
+... 60679-64-3
+... 70051-97-7
+... 126266-35-1
+... 126371-03-7
+... 153250-52-3
+... 308067-58-5
+... 2040295-03-0
+...
+... '''
+>>> [x for x in numbers.splitlines() if x and not casrn.is_valid(x)]
+[]

-----------------------------------------------------------------------

Summary of changes:
 stdnum/{eu/banknote.py => casrn.py} |  50 +++++++++--------
 tests/test_casrn.doctest            | 105 ++++++++++++++++++++++++++++++++++++
 2 files changed, 132 insertions(+), 23 deletions(-)
 copy stdnum/{eu/banknote.py => casrn.py} (52%)
 create mode 100644 tests/test_casrn.doctest
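
For readers who want to check the arithmetic behind calc_check_digit() by
hand, here is a minimal standalone sketch of the same weighting scheme. It
does not import stdnum; the names cas_check_digit and split_cas are
illustrative only and are not part of the library. The digits before the
check digit are weighted 1, 2, 3, ... from right to left, and the sum is
taken modulo 10.

```python
def cas_check_digit(body):
    """Return the check digit for a CAS RN given WITHOUT its check digit.

    Illustrative helper (not part of stdnum), mirroring the weighting
    used in the commit above.
    """
    digits = body.replace('-', '')
    # The rightmost digit gets weight 1, the next one weight 2, and so on.
    total = sum(
        weight * int(digit)
        for weight, digit in enumerate(reversed(digits), start=1))
    return str(total % 10)


def split_cas(number):
    """Insert separators into an unseparated CAS RN (illustrative helper).

    The last digit is the check digit, the two digits before it form the
    second component, and everything else is the first component.
    """
    return '-'.join((number[:-3], number[-3:-1], number[-1:]))


# 87-86-5: 6*1 + 8*2 + 7*3 + 8*4 = 75, and 75 % 10 == 5
print(cas_check_digit('87-86'))    # 5
# 7732-18-5: 8*1 + 1*2 + 2*3 + 3*4 + 7*5 + 7*6 = 105, and 105 % 10 == 5
print(cas_check_digit('7732-18'))  # 5
print(split_cas('329657'))         # 329-65-7
```

Changing the last digit of 87-86-5 to 6 no longer matches the computed
value, which is why the module doctest expects InvalidChecksum for 87-86-6.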


hooks/post-receive
-- 
python-stdnum