This introduction and additional information are available as a PDF file.
The Chapman & Hall/CRC Chemical Database is a structured database holding information on chemical substances. It includes descriptive and numerical data on chemical, physical and biological properties of compounds; systematic and common names of compounds; literature references; structure diagrams and their associated connection tables. The Dictionary of Natural Products Online is a subset of this database and includes all compounds contained in the Dictionary of Natural Products (Main Work and Supplements).
The Dictionary of Natural Products (DNP) is the only comprehensive and fully-edited database on natural products. It arose as a daughter product of the well-known Dictionary of Organic Compounds (DOC) which, since its inception in the 1930s, has through successive editions always been a leading source of natural product information.
In the early 1980s, following the publication of the Fifth Edition of DOC, the first to be founded on database methods, the Editors and contributors for the various classes of natural products embarked on a programme of enlargement, rationalisation and classification of the natural product entries, while at the same time keeping the coverage up-to-date. In 1992 the results of this major project, which had grown to match DOC in size, were separately published in both book (7 volumes) and CD-ROM format, leaving DOC with coverage of only the most widely distributed and/or practically important natural products. DNP compilation has since continued unabated by a combination of an exhaustive survey of current literature and of historical sources such as reviews to pick up minor natural products and items of data previously overlooked.
The compilation of DNP is undertaken by a team of academics and freelancers who work closely with the in-house editorial staff at Chapman & Hall. Each contributor specialises in a particular natural product class (e.g. alkaloids) and is able to reorganise and classify the data in the light of new research so as to present it in the most consistent and logical manner possible. Thus the compilation team is able to reconcile errors and inconsistencies.
The resulting on-line version represents an extremely well organised dictionary documenting virtually every known natural product.
A valuable feature of the design is that closely related natural products (e.g. where one is a glycoside or simple ester of another) are organised into the same entry, thus simplifying and bringing out the underlying structural and biosynthetic relationships of the compounds. Structure diagrams are drawn and numbered in the most consistent way according to best stereochemical and biogenetic relationships. In addition, every natural product is indexed by structural/biogenetic type under one of more than 1000 headings, allowing the rapid location of all compounds in the category, even where they have undergone biogenetic modification and no longer share exactly the same skeleton.
There is extensive (but not complete) coverage of natural products of unknown structure, and the coverage of these is currently being enhanced by various retrospective searches.
In the database, closely related compounds are grouped together to form an
entry. Stereoisomers and derivatives of a parent compound are all listed under
one entry. The compounds in the Dictionary of Natural Products are grouped
together into approximately 40,000 entries. The structure of an entry is shown
A simple entry covers one compound, with no derivatives or variants. A composite entry will start with the entry compound, then may have:
Variants may include stereoisomers, e.g. (R)-form, endo-form; members of a series of natural products with closely related structures such as antibiotic complexes.
For example, Trienomycins are often treated as variants although their structures may be more varied.
Derivatives may include hydrates, complexes, salts, classical organic derivatives, substitution products and oxidation products etc. Derivatives may exist on more than one functional group of an entry compound. The following techniques are among those used to bring together related substances in the same entry:
The format of a typical entry is given in Fig. 1, and shows the individual types
of data that may be present in an entry.
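The grouping of compounds into entries described above can be sketched as a small data model. This is only an illustration: the class names `Compound` and `Entry`, their fields, and the placeholder compound names are hypothetical and do not reflect the actual database schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Compound:
    """One substance within an entry (hypothetical model, not the DNP schema)."""
    name: str
    kind: str = "entry compound"  # assumed labels: "entry compound", "variant", "derivative"


@dataclass
class Entry:
    """An entry: the entry compound plus any variants and derivatives."""
    parent: Compound
    variants: List[Compound] = field(default_factory=list)
    derivatives: List[Compound] = field(default_factory=list)

    def is_simple(self) -> bool:
        # A simple entry covers one compound, with no derivatives or variants.
        return not self.variants and not self.derivatives

    def all_compounds(self) -> List[Compound]:
        # Entry compound first, then variants, then derivatives.
        return [self.parent, *self.variants, *self.derivatives]


# Placeholder names only; no real DNP data is implied.
entry = Entry(Compound("Parent compound"))
entry.variants.append(Compound("(R)-form", kind="variant"))
entry.derivatives.append(Compound("Hydrochloride", kind="derivative"))
```

In this sketch a composite entry is simply any entry whose `variants` or `derivatives` list is non-empty.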
Chemical names and synonyms
All the names discussed below can be searched using the Chemical Name field.
Compounds have been named so as to facilitate access to their factual data by
keeping the nomenclature as simple as possible, whilst still adhering to good
practice as determined by IUPAC (the International Union of Pure and Applied
Chemistry). A great deal of care has been taken to achieve this aim as nearly as
possible. Some intentional departures from IUPAC terminological principles are
occasionally made to clarify the nomenclature of natural products. For example,
compounds containing both lactone and -COOH groups are often named using
two principal functional groups:
Fig. 1. Sample entry from database
Many other trivial appellations have from time to time appeared in the literature for other acyl groups (e.g., Senecioyl = 3-methyl-2-butenoyl, Feruloyl = 3-(4-hydroxy-3-methoxyphenyl)-2-propenoyl or 4-hydroxy-3-methoxycinnamoyl) but the systematic forms are usually employed except in a few cases where the shortened form is used to abbreviate a very long and unwieldy derivative descriptor as much as possible (e.g., for some of the complex flavonoid glycosides).
CAS Registry Numbers are identifying numbers allocated to each distinctly definable chemical substance indexed by the Chemical Abstracts Service since 1965 (plus retrospective allocation of numbers by CAS to compounds from the sixth and seventh collective index periods). The numbers have no chemical significance but they provide a label for each substance independent of any system of nomenclature.
In DNP, much effort has been expended to ensure that accurate CAS numbers are given for as many substances as possible.
If a CAS number is not given for a particular compound, it may be (a) because CAS have not allocated one, (b) very occasionally, because an editorial decision cannot be made as to the correct number to cite, or (c) because the substance was added to the DNP database at a late stage in the compilation process, in which case the number will probably be added to the database soon.
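Although CAS numbers carry no chemical significance, their final digit is a check digit, so a number copied from an entry can be tested for transcription errors before it is used in a search. The check-digit rule is a general property of CAS Registry Numbers and is not described in this introduction. The function name `is_valid_cas` is hypothetical, and the sketch below tests only the format and checksum, not whether the number has actually been allocated.

```python
import re


def is_valid_cas(rn: str) -> bool:
    """Return True if rn has the CAS format and a correct check digit."""
    # Format: 2-7 digits, hyphen, 2 digits, hyphen, 1 check digit.
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", rn)
    if match is None:
        return False
    digits = match.group(1) + match.group(2)
    # Weight each digit by its position counted from the right (1, 2, 3, ...).
    total = sum(i * int(d) for i, d in enumerate(reversed(digits), start=1))
    return total % 10 == int(match.group(3))


print(is_valid_cas("7732-18-5"))  # water
print(is_valid_cas("7732-18-4"))  # same number with a wrong check digit
```

A failed check indicates a typing or transcription error. A passed check does not confirm that the number belongs to the intended compound.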
At the foot of the DNP entry, immediately before the references, may be shown additional registry numbers. These are numbers which have been recognised by the DNP editors or contributors as belonging to the entry concerned but which cannot be unequivocally assigned to any of the compounds covered by the entry. Their main use will be in helping those who need to carry out additional searches, especially online searches in the CAS or other databases, and who will be able to obtain additional hits using these numbers. Clearly, discretion is needed in their use for this purpose.
Additional registry numbers may arise for a variety of reasons:
In each entry display there is a single diagram which applies to the parent entry. Separate diagrams are not given for variants or derivatives.
Every attempt has been made to present the structures of chemical substances as accurately as possible according to current best practice and IUPAC recommendations. In drawing the formulae, as much consistency as possible between closely related structures has been aimed at. Thus, for example, sugars have been standardised as Haworth formulae and, wherever possible in complex structures, the rings are oriented in the standard Haworth manner so that structural comparisons can quickly be made. In formulae the pseudoatom abbreviations Me, Et and Ac for methyl, ethyl and acetyl respectively, are used only when attached to a heteroatom. Ph is used throughout whether attached to carbon or to a heteroatom. Other pseudoatom abbreviations such as Pri for isopropyl and Bz for benzoyl are not used in DNP.
Care must be taken with the numbering of natural products, as problems may arise due to differences in systematic and non-systematic schemes. Biogenetic numbering schemes which are generally favoured in DNP may not always be contiguous, e.g., where one or more carbon atoms have been lost during biogenesis.
Structures for derivatives can be viewed in Structure Search, but remember
that these structures are generated from connection tables and may not always
be oriented consistently.
Where the absolute configuration of a compound is known or can be inferred from the published literature without undue difficulty, this is indicated. Where only one stereoisomer is referred to in the text, the structural diagram indicates that stereoisomer. Wherever possible, stereostructures are described using the Cahn-Ingold-Prelog sequence-rule (R,S) and (E,Z) conventions but, in cases where these are cumbersome or inapplicable, alternatives such as the α,β-system are used instead. Alternative designations are frequently presented in such cases.
The structure diagrams for compounds containing one or two chiral centres are given in DNP as Fischer-type diagrams showing the stereochemistry unequivocally. True Fischer diagrams in which the configuration is implied by the North-South-East-West positions of the substituents are widespread in the literature; they are quite unambiguous but need to be used with caution by the inexperienced. They cannot be reoriented without the risk of introducing errors.
Where only the relative configuration of a compound containing more than one chiral centre is known, the symbols (R*) and (S*) are used, the lowest-numbered chiral centre being arbitrarily assigned the symbol (R*). For racemic modifications of compounds containing more than one chiral centre the symbols (RS) and (SR) are used, with the lowest-numbered chiral centre being arbitrarily assigned the symbol (RS). The racemate of a compound containing one chiral centre only is described in DNP as (±)-.
In comparing CAS descriptors with those given in DNP, it is important to remember that the order of presentation of the chirality labels in CAS is itself based on the sequence rule priority and not on any numbering scheme. For example, the CAS descriptor for the structure illustrated is [S-(R*,S*)].
The relative stereochemical label (R*,S*) is first applied with the R* applying to the chiral centre of higher priority (C-3). The absolute stereochemical descriptor (S)- is then applied changing R* to S for the chiral centre of higher priority and S* to R for the chiral centre of lower priority (C-2). For further details, see the current CAS Index Guide.
For simplicity, the enantiomers of bridged-ring compounds, such as camphor, are described simply as (+)- and (-)-. Although camphor has two chiral centres, steric restraints mean that only one pair of enantiomers can be prepared.
For further information on the (R,S)-system, see Cahn, R.S. et al., J. Chem. Soc., 1951, 612; Experientia, 1956, 12, 81; Angew. Chem. Int. Ed. Engl., 1966, 5, 383.
Where appropriate, alternative stereochemical descriptors may be given using
the D, L or α,β-systems. For a fuller description of these systems, consult The Organic Chemist's Desk Reference (Chapman & Hall, 1995).
The elements in the molecular formula are given according to the Hill convention (C, H, then other elements in alphabetical order). The molecular weights given are formula weights (or more strictly, molar masses in daltons) and are rounded to one place of decimals. In the case of some high molecular mass substances such as proteins the value quoted may be that taken from an original literature source and may be an aggregate molar mass.
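The ordering and rounding rules above can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function names `hill_formula` and `formula_weight` are hypothetical, the atomic weight table covers only four elements, and the rule for carbon-free formulae (all elements alphabetical) comes from the general Hill system rather than from this introduction.

```python
from typing import Dict

# Standard atomic weights for a few elements only (illustrative subset).
ATOMIC_WEIGHTS = {"C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999}


def hill_formula(counts: Dict[str, int]) -> str:
    """Write element counts as a formula string in Hill order."""
    if "C" in counts:
        # C first, then H, then the remaining elements alphabetically.
        rest = sorted(el for el in counts if el not in ("C", "H"))
        order = ["C"] + (["H"] if "H" in counts else []) + rest
    else:
        # Assumption from the general Hill system: no carbon, so all alphabetical.
        order = sorted(counts)
    return "".join(el + (str(counts[el]) if counts[el] > 1 else "") for el in order)


def formula_weight(counts: Dict[str, int]) -> float:
    """Sum the atomic weights and round to one decimal place."""
    return round(sum(ATOMIC_WEIGHTS[el] * n for el, n in counts.items()), 1)


camphor = {"O": 1, "H": 16, "C": 10}
print(hill_formula(camphor), formula_weight(camphor))
```

For camphor the sketch gives the formula C10H16O and a formula weight of 152.2.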
Molecular formulae are included in DNP for all derivatives which are natural products and so are readily searchable, whether they are documented as derivatives or have their own individual entry. Molecular formulae are not in general given for salts, hydrates or complexes (e.g. picrates) nor for most "characterisation" derivatives such as acetates and methyl ethers of complex natural products.
Where a derivative appears to have been characterised only as a salt, the properties
of the salt may be given under the heading for the derivative. In such cases the
data is clearly labelled, e.g., Mp 179° (as hydrochloride).
The taxonomic names for organisms given throughout are in general those given
in the primary literature. Standardisation of minor orthographical variations has
been carried out. Data in this field may be searched under Source/Synthesis or
All Text. Standards used are: Brummitt, R.K. (1992) Vascular Plant Families
and Genera, Royal Botanic Gardens, Kew; Willis, J.C. (1973) A Dictionary of
the Flowering Plants, Cambridge University Press, Cambridge; Gozmany, L.
(1990) Seven Language Thesaurus of European Animals, Chapman & Hall,
London; Chemical Abstracts Service.
Care has been taken to make the information given on the importance and uses
of chemical substances as accurate as possible. Data in this field may be
searched under Use/Importance or All Text.
All natural products are classified under one of more than 1050 headings according to structural type, e.g., daucane sesquiterpenoid, pyrrolizidine alkaloid, withanolide. Each structural type is assigned a Type of Compound code, e.g., VG0300, VX0150. Type of Compound words and Type of Compound codes may both be searched in Menu and Command search.
The full type of compound code index is given in Table 3, page 128 of the
printed User Manual, and in the Description of Natural Product Structures that
follows, each descriptive paragraph is followed by its Type of Compound code.
Natural products are considered to be colourless unless otherwise stated. Where the compound contains a chromophore which would be expected to lead to a visible colour, but no colour is mentioned in the literature, the DNP entry will mention this fact if it has been noticed by the contributor.
An indication of crystal form and of recrystallisation solvent is often given
but these are imprecise items of data; most organic compounds can be
crystallised from several solvent systems and the crystal form often varies. In
the case of the small number of compounds where crystal behaviour has been
intensively studied (e.g. pharmaceuticals), it is found that polymorphism is a
very common phenomenon and there is no reason to believe that it is not
widespread among organic compounds generally.
Melting points and boiling points
The policy followed in the case of conflicting data is as follows:
These are given whenever possible, and normally refer to what the DNP contributor believes to be the best-characterised sample of highest chemical and optical purity. Where available an indication of the optical purity (op) or enantiomeric excess (ee) of the sample measured now follows the specific rotation value.
Specific rotations are dimensionless numbers and the degree sign which was
formerly universal in the literature has been discontinued.
Densities and refractive indexes
Densities and refractive indexes are now of less importance for the identification of liquids than has been the case in the past, but are quoted for common or industrially important substances (e.g. monoterpenoids), or where no boiling point can be found in the literature.
Densities and refractive indexes are not quoted where the determination
appears to refer to an undefined mixture of stereoisomers.
Solubilities are given only where the solubility is unusual. Typical organic
compounds are soluble in the usual organic solvents such as ether and
chloroform, and virtually insoluble in water. The presence of polar groups (OH,
NH2 and especially COOH, SO3H, NR+) increases water solubility.
pKa values are given for both acids and bases. The pKb of a base can be
obtained by subtracting its pKa from 14.17 (at 20°) or from 14.00 (at 25°).
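The conversion stated above is a single subtraction, sketched below. The function name `pkb_from_pka` and the lookup table `PKW` are hypothetical, only the two temperatures quoted in the text are supported, and the pKa value 9.25 is an arbitrary illustrative input.

```python
# Constants quoted in the text: 14.17 at 20 degrees, 14.00 at 25 degrees.
PKW = {20: 14.17, 25: 14.00}


def pkb_from_pka(pka: float, temp_c: int = 25) -> float:
    """Return pKb = constant(T) - pKa for the temperatures given in the text."""
    if temp_c not in PKW:
        raise ValueError("only 20 and 25 degrees are covered by this sketch")
    return round(PKW[temp_c] - pka, 2)


print(pkb_from_pka(9.25))      # at 25 degrees
print(pkb_from_pka(9.25, 20))  # at 20 degrees
```

With a pKa of 9.25 this gives a pKb of 4.75 at 25° and 4.92 at 20°.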
Spectroscopic data such as uv wavelengths and extinction coefficients are given only where the spectrum is a main point of interest, or where the compound is unstable and has been identified only by spectroscopic data.
In many other cases, spectroscopic data can be rapidly located through the
Toxicity and hazard information is highlighted by a hazard sign, and has been selected to assist in risk assessments for experimental, manufacturing and manipulative procedures with chemicals.
The field of safety testing is complex, difficult and rapidly expanding. While as much care as possible has been taken to ensure the accuracy of reported data, the Dictionary must not be considered a comprehensive source of hazard data. The function of the reported hazard data is to alert the user to possible hazards associated with the use of a particular compound. The absence of such data cannot be taken as an indication of safety in use, nor does the omission of hazard data from DNP imply that no such data exists in the literature, and the Publishers cannot be held responsible for any inaccuracies in the reported information. Widely recognised hazards are included, however, and where possible key toxicity reviews are identified in the references. Further advice on the storage, handling and disposal of chemicals is given in The Organic Chemist's Desk Reference.
Finally, it should be emphasised that any chemical has the potential for harm
if it is carelessly used. For many newly isolated materials, hazardous properties
may not be apparent or may not have been cited in the literature. In addition, the
toxicity of some very reactive chemicals may not have been evaluated for
ethical reasons, and these substances in particular should be handled with
RTECS® Accession Numbers*
Many entries in DNP contain one or more RTECS® Accession Numbers. Possession
of these numbers allows users to locate toxicity information on relevant
substances from the NIOSH Registry of Toxic Effects of Chemical Substances,
which is a compendium of toxicity data extracted from the scientific literature.
For each Accession Number, the RTECS® database provides the following
data when available: substance prime name and synonyms; date when the
substance record was last updated; CAS Registry Number; molecular weight
and formula; reproductive, tumorigenic and toxic dose data; and citations to
aquatic toxicity ratings, IARC reviews, ACGIH Threshold Limit Values,
toxicological reviews, existing Federal standards, the NIOSH criteria document
program for recommended standards, the NIOSH current intelligence program,
the NCI Carcinogenesis Testing Program, and the EPA Toxic Substances
Control Act inventory. Each data line and citation is referenced to the source
from which the information was extracted.
The selection of references is made with the aim of facilitating entry into the literature for the user who wishes to locate more detailed information about a particular compound. Thus, in general, recent references are preferred to older ones, particularly for chiral compounds where optical purity and absolute configuration may have been determined relatively recently. The number of references quoted cannot therefore be taken as an indication of the relative importance of a compound, and the references quoted for important substances may not be the most significant historically.
References are given in date order except for references to spectroscopic library collections, which sort at the top of the list, and those to hazard/toxicity sources which sort at the bottom.
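The ordering rule above amounts to a two-part sort key: a rank for the kind of reference, then the date. The sketch below is illustrative only. The `Reference` type, the `kind` labels and the placeholder citations are hypothetical, and sorting by year alone is an assumption about how "date order" is applied.

```python
from typing import List, NamedTuple


class Reference(NamedTuple):
    citation: str
    year: int
    kind: str = "general"  # assumed labels: "spectral_library", "general", "hazard"


# Spectroscopic library collections first, hazard/toxicity sources last.
_RANK = {"spectral_library": 0, "general": 1, "hazard": 2}


def sort_references(refs: List[Reference]) -> List[Reference]:
    """Sort by reference kind, then by year within each kind."""
    return sorted(refs, key=lambda r: (_RANK[r.kind], r.year))


# Placeholder citations only; no real references are implied.
refs = [
    Reference("ref C", 1990),
    Reference("ref D (hazard source)", 1985, "hazard"),
    Reference("ref A", 1972),
    Reference("ref B (spectral library)", 1995, "spectral_library"),
]
for ref in sort_references(refs):
    print(ref.citation, ref.year)
```

Because Python's `sorted` is stable, references of the same kind and year keep their original relative order.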
The content of most references is indicated by means of suffixes, known as reference tags. A list of the most common ones is given in Table 4, p. 145 of the printed User Manual. For references describing a minor natural product which has been included in DNP as a derivative of a parent compound, the reference tag may be the identifying name of the natural product, e.g. (Laciniatoside II).
Some reference suffixes are now given in boldface type, where the editors consider the reference to be particularly important, for example the best synthesis giving full experimental details and often claiming a higher yield than previously reported methods.
In some entries, minor items of information, particularly the physical
properties of derivatives, may arise from references not cited in the entry.
In general these are uniform with the Chemical Abstracts Service Source Index
(CASSI) listing except for a short list of very common journals:
The database is continually updated. When an entry is undergoing revision at
the time of an on-line release (for example by the addition of further
derivatives or references), this is indicated by a message at the head of the entry.
*RTECS® Accession Numbers are compiled and distributed by the National Institute for Occupational Safety and Health Service of the U.S. Department of Health and Human Services of The United States of America. All rights reserved. (1996)
DNP 22.2 Copyright © 2014 Taylor & Francis Group
All Rights Reserved