CAS Registry Number Validation using PL SQL

CAS Registry Numbers are universally used to provide a unique, unmistakable identifier for chemical substances. CAS stands for Chemical Abstracts Service, a division of the American Chemical Society, that assigns a unique numerical identifier to every chemical substance described in the open scientific literature.

The CAS Registry database contains more than 113 million unique organic and inorganic chemical substances, such as alloys, coordination compounds, minerals, mixtures, polymers and salts, and more than 66 million sequences. CAS Registry contains substances reported in the literature back to the early 1800s and over 15,000 substances are added daily. [1]

A CAS Registry Number itself has no inherent chemical significance but provides an unambiguous way to identify a chemical substance or molecular structure when there are many possible systematic, generic, proprietary or trivial names. [2] These numbers are assigned in a sequential, increasing order when the substance is identified by CAS scientists for inclusion in the CAS REGISTRY database,

A CAS number is separated by hyphens into three parts, the first consisting from two up to seven digits, the second consisting of two digits, and the third consisting of a single digit serving as a check digit. The check digit is calculated by taking the sum of last digit times 1, the previous digit times 2 and the still previous digit times 3 and so on, and finally computing the remainder upon dividing the sum by 10.

For example, the CAS number of water is 7732-18-5: the checksum 5 is calculated as (8×1 + 1×2 + 2×3 + 3×4 + 7×5 + 7×6) = 105; 105 mod 10 = 5.

I wrote a PL SQL function to validate CAS Registry Numbers for use in one of the database applications for a semiconductors company for managing the chemicals used in their chip manufacturing facilities.

The function f_is_valid_cas_number takes a CAS number as input and returns 1 if it is valid and 0 otherwise. The following examples show how the function can be used:

-- Valid CAS Number for Water, returns 1
SELECT f_is_valid_cas_number('7732-18-5') FROM dual;

-- Invalid syntax, returns 0
SELECT f_is_valid_cas_number('7A2-181-522') FROM dual;

-- Valid syntax but incorrect checksum, returns 0
SELECT f_is_valid_cas_number('1-11-5') FROM dual;

In an Oracle PL SQL program, the function can be used to set a flag as follows:

-- Valid CAS Number for Water, sets is_valid_cas_number flag to 1
is_valid_cas_number NUMBER(1) := f_is_valid_cas_number('7732-18-5');

The code for the function is available in the CAS Number Validator repository on Github. I hope you find it useful.