Navigating the world of chemical data

Representation of 2D structures on computer

Why do we need to handle chemical information in special ways on computer?
This module will cover the need for special 2D chemical representations, the SMILES and InChI linear notations, internal graph-based representations using adjacency matrices, and some of the subtleties that come with aromaticity, tautomerism and stereochemistry. In SMILES, atoms are represented by their atomic symbols. Hydrogen atoms automatically saturate free valences and are not written explicitly.
Neighbouring atoms stand next to each other. Branches are represented by parentheses. Rings are described by allocating digits to the two "connecting" ring atoms.

Quiz

Try to create a SMILES by hand for adamantane. You can cross-reference your solution with the SMILES on the ChemSpider page.

Background information

Historic ways of representing chemicals

Trivial name, e.g.
Baking Soda, Aspirin, Citric Acid, etc. Identifies the compound, but gives little or no information about what it consists of.
Chemical formula:
Specifies the type and quantity of the atoms in the compound, but not its structure.
A 2D structure diagram identifies the atoms present and how they are connected by bonds.

The answer to the former question was linear notations: the earliest example was Wiswesser Line Notation, followed by Beilstein's ROSDAL, which is still used in a limited fashion. Early work was also done on ways of using linear notations for indexing structures, including the Lawson Number.
Today, linear representations are extremely useful, not because computers can only work in text, but because text is still the most efficient way of storing and communicating information. The most popular current linear representations are SMILES (try it with Daylight Depict) and InChI (see the unofficial InChI FAQ), although some others are in use, such as SLN. Here is an example of the SMILES for a common drug: Linear notations are not the only way of communicating structure: These have the advantage of flexibility, but they are much more verbose.
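Returning to SMILES itself, the basic rules listed earlier (atomic symbols, implicit hydrogens, adjacent atoms being bonded, parentheses for branches, digits for ring closures) can be sketched as a toy reader. This is only a minimal illustration under stated assumptions, not a full SMILES implementation: it ignores bracket atoms, charges, stereo marks and two-digit ring labels, the names `parse_smiles` and `implicit_hydrogens` are hypothetical, and the hydrogen count uses a simplified lowest-valence rule. The aspirin SMILES at the bottom is an illustrative choice.

```python
# Toy SMILES reader covering only the basic rules (a sketch, not a real parser).
ORGANIC = ("Cl", "Br", "B", "C", "N", "O", "P", "S", "F", "I")
AROMATIC = ("c", "n", "o", "s")
BOND_ORDER = {"-": 1, "=": 2, "#": 3}
# Assumed "normal" valences used to fill in implicit hydrogens.
VALENCE = {"B": 3, "C": 4, "N": 3, "O": 2, "P": 3, "S": 2,
           "F": 1, "Cl": 1, "Br": 1, "I": 1}


def parse_smiles(smiles):
    """Return (atoms, is_aromatic, bonds); bonds are (i, j, order), 4 = aromatic."""
    atoms, is_aromatic, bonds = [], [], []
    branch_stack = []   # atoms to go back to when a ')' closes a branch
    ring_open = {}      # ring digit -> (atom index, explicit bond order or None)
    prev = None         # atom that the next atom will be bonded to
    pending = None      # explicit bond symbol waiting for its second atom

    def add_bond(a, b, explicit):
        if explicit is None:  # default: aromatic between aromatic atoms, else single
            explicit = 4 if is_aromatic[a] and is_aromatic[b] else 1
        bonds.append((a, b, explicit))

    i = 0
    while i < len(smiles):
        ch = smiles[i]
        if ch in BOND_ORDER:
            pending = BOND_ORDER[ch]
            i += 1
        elif ch == "(":
            branch_stack.append(prev)
            i += 1
        elif ch == ")":
            prev = branch_stack.pop()
            i += 1
        elif ch.isdigit():
            if ch in ring_open:  # second occurrence of the digit closes the ring
                other, order = ring_open.pop(ch)
                add_bond(other, prev, order if order is not None else pending)
            else:                # first occurrence opens it
                ring_open[ch] = (prev, pending)
            pending = None
            i += 1
        else:
            symbol = smiles[i:i + 2] if smiles[i:i + 2] in ORGANIC else ch
            if symbol not in ORGANIC and symbol not in AROMATIC:
                raise ValueError(f"unsupported SMILES token: {symbol!r}")
            atoms.append(symbol.capitalize())
            is_aromatic.append(symbol in AROMATIC)
            if prev is not None:
                add_bond(prev, len(atoms) - 1, pending)
            prev, pending = len(atoms) - 1, None
            i += len(symbol)
    return atoms, is_aromatic, bonds


def implicit_hydrogens(atoms, is_aromatic, bonds):
    """Hydrogens needed to saturate each atom (simplified valence model)."""
    used = [0] * len(atoms)
    for a, b, order in bonds:
        for idx in (a, b):
            used[idx] += 1 if order == 4 else order
    # An aromatic atom gives up one extra valence to the ring system.
    return [max(0, VALENCE[sym] - used[k] - (1 if is_aromatic[k] else 0))
            for k, sym in enumerate(atoms)]


atoms, aromatic, bonds = parse_smiles("CC(=O)Oc1ccccc1C(=O)O")  # aspirin
print(len(atoms), len(bonds), sum(implicit_hydrogens(atoms, aromatic, bonds)))
# 13 heavy atoms, 13 bonds, 8 hydrogens (C9H8O4)
```

The output of such a reader, a list of atoms plus a list of bonds, is essentially the kind of internal representation discussed next.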
The internal representation for 2D structures is the same as one would use to represent a mathematical graph (which is useful, see later!). The atom lookup table assigns a unique number to each atom, along with listing other properties such as atom type; the connection table is an adjacency matrix which shows which atoms are bonded to which other atoms, the bond order being indicated by the number in a cell. By convention, a 4 can be used for an "aromatic" bond.
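The two tables can be sketched in a few lines of Python. The example below assumes acetaminophen (SMILES `CC(=O)Nc1ccc(O)cc1`) with an arbitrary atom numbering chosen just for this illustration; `connection_table` and `neighbours` are hypothetical helper names.

```python
# Atom lookup table: atom number -> element (hydrogens left implicit).
# The numbering is arbitrary and was chosen only for this sketch.
ATOMS = {1: "C", 2: "C", 3: "O", 4: "N", 5: "C", 6: "C",
         7: "C", 8: "C", 9: "O", 10: "C", 11: "C"}

# Bonds as (atom, atom, order); 4 marks an "aromatic" bond.
BONDS = [(1, 2, 1), (2, 3, 2), (2, 4, 1), (4, 5, 1),
         (5, 6, 4), (6, 7, 4), (7, 8, 4), (8, 9, 1),
         (8, 10, 4), (10, 11, 4), (11, 5, 4)]


def connection_table(atoms, bonds):
    """Build a symmetric adjacency matrix holding bond orders (0 = no bond)."""
    n = len(atoms)
    table = [[0] * n for _ in range(n)]
    for a, b, order in bonds:
        table[a - 1][b - 1] = order
        table[b - 1][a - 1] = order
    return table


def neighbours(table, atom):
    """Atom numbers bonded to the given atom, read off one matrix row."""
    return [j + 1 for j, order in enumerate(table[atom - 1]) if order]


table = connection_table(ATOMS, BONDS)
for number, row in enumerate(table, start=1):
    print(f"{number:>2} {ATOMS[number]}  " + " ".join(str(x) for x in row))
print(neighbours(table, 2))  # atoms bonded to the carbonyl carbon
```

A full matrix is wasteful for large molecules because most cells are zero, so real systems often store only the bond list; the matrix form is used here because it matches the description above.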
Here is an example atom lookup table and connection table for acetaminophen (Tylenol, paracetamol): Note that if we need to ensure that the same molecule is numbered the same way each time, we need an algorithm that consistently numbers atoms via rules.
In this algorithm (the Morgan algorithm), each atom is given a "connectivity value" reflecting how many atoms it is connected to. This value is iteratively replaced by the sum of the connectivity values of its neighbors, until the number of different values is maximized.
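The iteration just described can be sketched as follows. This is a simplified illustration rather than a production canonicalizer: it works on heavy atoms only, ignores element types and bond orders, and stops as soon as a round fails to increase the number of distinct values. The acetaminophen neighbour lists and the name `morgan_values` are assumptions made for the example.

```python
def morgan_values(neighbours):
    """Iterate connectivity values until the count of distinct values stops growing.

    neighbours maps each atom number to the list of atoms it is bonded to.
    """
    values = {atom: len(nbrs) for atom, nbrs in neighbours.items()}
    n_classes = len(set(values.values()))
    while True:
        # Replace every value by the sum of the neighbouring atoms' values.
        new = {atom: sum(values[b] for b in nbrs)
               for atom, nbrs in neighbours.items()}
        new_classes = len(set(new.values()))
        if new_classes <= n_classes:
            return values          # keep the last round that still improved
        values, n_classes = new, new_classes


# Acetaminophen, CC(=O)Nc1ccc(O)cc1, with an arbitrary starting numbering.
ACETAMINOPHEN = {
    1: [2], 2: [1, 3, 4], 3: [2], 4: [2, 5], 5: [4, 6, 11], 6: [5, 7],
    7: [6, 8], 8: [7, 9, 10], 9: [8], 10: [8, 11], 11: [10, 5],
}

values = morgan_values(ACETAMINOPHEN)
for atom in sorted(values, key=values.get, reverse=True):
    print(atom, values[atom])
```

Atoms that are equivalent in the bare graph (for example the two ring carbons on either side of the nitrogen-bearing carbon) end up with equal values, which is why a tie-breaking rule is still needed afterwards.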
Atoms are then numbered in decreasing order of connectivity value. In the case of a tie, other properties are used. Doing this is an important basis for producing canonical representations.

Representation nuances

We now have some neat, simple ways of representing and communicating 2D chemical structures. However, there are some nuances of chemistry that complicate matters.
In particular, stereochemistry, aromaticity and tautomers: Most representations don't inherently store stereochemical information, and we have a policy decision about whether we actually want to differentiate stereoisomers (in some instances, such as thalidomide, it makes a life or death difference!).
This can be done at the representation level, or the computation level. Stereoisomerism is addressed in Isomeric SMILES and InChI. For aromaticity, it is not always entirely clear whether a ring should be considered "aromatic" or not, and even if so, it may be represented as alternating single and double bonds, or in "aromatic" form.
This can be addressed at the representation or computation level. For tautomerism, the same functional group can be represented differently, either through different conventions or to indicate a particular state (usually at a particular pH). Tautomerism is addressed in InChI.

The usefulness of graph theory

Graph theory is a branch of mathematics that is used to model graphs: objects (nodes) with links between them (edges).
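As a small illustration of nodes and edges, the sketch below stores a graph as an adjacency list and runs two standard graph algorithms on it: breadth-first search for shortest path lengths, and the cyclomatic number (edges minus nodes plus connected components) for counting independent cycles. The example graph is a six-membered ring with one extra node attached, which could be read as the heavy-atom skeleton of a methyl-substituted ring; the function names are hypothetical.

```python
from collections import deque


def bfs_distances(graph, start):
    """Shortest path length (in edges) from start to every reachable node."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nbr in graph[node]:
            if nbr not in dist:
                dist[nbr] = dist[node] + 1
                queue.append(nbr)
    return dist


def count_components(graph):
    """Number of connected pieces in the graph."""
    seen, components = set(), 0
    for node in graph:
        if node not in seen:
            components += 1
            seen.update(bfs_distances(graph, node))
    return components


def cyclomatic_number(graph):
    """Independent cycles: edges - nodes + components."""
    edges = sum(len(nbrs) for nbrs in graph.values()) // 2
    return edges - len(graph) + count_components(graph)


# Six-membered ring (nodes 1-6) with node 7 attached to node 1.
GRAPH = {1: [2, 6, 7], 2: [1, 3], 3: [2, 4], 4: [3, 5],
         5: [4, 6], 6: [5, 1], 7: [1]}

print(bfs_distances(GRAPH, 7)[4])   # path length from node 7 to the far side of the ring
print(cyclomatic_number(GRAPH))     # number of rings
```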
How does this apply to chemical structures? Well, if we consider atoms as nodes and bonds as edges, we have access to a large number of graph theory algorithms.

Representing reactions

Structural representations of reactions need to identify only the arrangement of products and reagents, and possibly which reagent atom maps to which product atom; other information such as reaction conditions and yield is generally stored separately.
Reaction SMILES is a superset of SMILES with symbols for the reaction arrow and for separating the components of the reaction.
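In the Daylight convention, a reaction SMILES has the form `reactants>agents>products`, with `.` separating the individual molecules in each part and an empty middle part written as `>>`. The sketch below simply splits such a string into its components; it does not validate the SMILES themselves, and the function name `split_reaction_smiles` and the esterification example are illustrative assumptions.

```python
def split_reaction_smiles(rxn):
    """Split 'reactants>agents>products' into lists of molecule SMILES."""
    parts = rxn.split(">")
    if len(parts) != 3:
        raise ValueError("expected the form 'reactants>agents>products'")
    # '.' separates the molecules inside each part; empty parts give empty lists.
    reactants, agents, products = (
        [mol for mol in part.split(".") if mol] for part in parts
    )
    return {"reactants": reactants, "agents": agents, "products": products}


# Acetic acid + ethanol -> ethyl acetate + water (no agents given).
reaction = split_reaction_smiles("CC(=O)O.OCC>>CC(=O)OCC.O")
print(reaction["reactants"])
print(reaction["agents"])
print(reaction["products"])
```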
SMIRKS is a superset of Reaction SMILES that allows mapping of individual atoms. Note that Reaction SMILES and SMIRKS are languages for representing transformations, which may or may not be valid reactions. For example, a common use for SMIRKS is representing generic reaction rules.

Representing generic (Markush) structures

Genericized forms of chemical structures are thought to have been first introduced by Eugene Markush as part of a patent (prior to that, patents were for specific structures).
Thus the term "Markush structures" came to be used for 2D representations that describe more than one actual structure (for example, by enumerating alternate groups on particular positions of the molecule). Representing generic structures is difficult because a Markush structure can represent an unlimited number of compounds. However, this problem has been addressed with text-based languages for describing generic structures, such as GENSAL, and extended connection table representations for internal use.
They are widely used in patent searching systems. We will be looking at Markush structures in more detail in a later class. Understanding Smiles from Abhik Seal.
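To make the idea of enumerating alternate groups at particular positions concrete, here is a rough sketch that expands a small generic structure into its specific members. It uses plain string templating on SMILES, which is only an illustration and not how GENSAL or extended connection tables work; the template, the R-group lists and the name `enumerate_markush` are all hypothetical.

```python
from itertools import product


def enumerate_markush(template, r_groups):
    """Yield one SMILES per combination of R-group choices.

    template uses {R1}, {R2}, ... placeholders; r_groups maps each
    placeholder name to its list of alternative SMILES fragments.
    """
    names = sorted(r_groups)
    for choice in product(*(r_groups[name] for name in names)):
        yield template.format(**dict(zip(names, choice)))


# A para-disubstituted benzene with two variable positions.
TEMPLATE = "{R1}c1ccc({R2})cc1"
R_GROUPS = {"R1": ["C", "Cl", "Br"], "R2": ["O", "N"]}

for smiles in enumerate_markush(TEMPLATE, R_GROUPS):
    print(smiles)
```

Even this tiny example gives 3 x 2 = 6 structures; the count multiplies with every added position, and an open-ended definition such as "any alkyl group" cannot be enumerated at all, which is why dedicated generic representations are needed.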