An MDL Molfile is a file format, created by MDL and now owned by Symyx, that holds information about the atoms, bonds, connectivity and coordinates of a molecule. The molfile consists of some header information, the Connection Table (CT) containing atom information, then bond connections and types, followed by sections for more complex information.
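The layout described above can be illustrated with a minimal sketch of a V2000 reader. It assumes the common fixed-width layout (three header lines, then a counts line whose first two 3-character fields give the atom and bond counts, then one line per atom and one line per bond). The names Atom, Bond, parse_v2000_molblock and SAMPLE are hypothetical, and the sketch ignores the property block and all optional columns.

```python
from typing import List, NamedTuple, Tuple


class Atom(NamedTuple):
    symbol: str
    x: float
    y: float
    z: float


class Bond(NamedTuple):
    begin: int  # 1-based index of the first atom
    end: int    # 1-based index of the second atom
    order: int  # bond type code (1 = single, 2 = double, ...)


def parse_v2000_molblock(text: str) -> Tuple[List[Atom], List[Bond]]:
    """Read the atom and bond blocks of a V2000 molfile (assumed layout)."""
    lines = text.splitlines()
    counts = lines[3]  # line 4 is the counts line; lines 1-3 are the header
    n_atoms = int(counts[0:3])
    n_bonds = int(counts[3:6])

    atoms = []
    for line in lines[4:4 + n_atoms]:
        # x, y, z occupy three 10-character fields; the symbol follows a space
        atoms.append(Atom(line[31:34].strip(),
                          float(line[0:10]),
                          float(line[10:20]),
                          float(line[20:30])))

    bonds = []
    first_bond = 4 + n_atoms
    for line in lines[first_bond:first_bond + n_bonds]:
        bonds.append(Bond(int(line[0:3]), int(line[3:6]), int(line[6:9])))
    return atoms, bonds


# Illustrative two-atom fragment (C-O), written as explicit lines so the
# fixed-width columns are easy to see.
SAMPLE = "\n".join([
    "example",
    "  hypothetical-program",
    "",
    "  2  1  0  0  0  0  0  0  0  0999 V2000",
    "    0.0000    0.0000    0.0000 C   0  0  0  0  0  0  0  0  0  0  0  0",
    "    1.4300    0.0000    0.0000 O   0  0  0  0  0  0  0  0  0  0  0  0",
    "  1  2  1  0  0  0  0",
    "M  END",
])

atoms, bonds = parse_v2000_molblock(SAMPLE)
```

A production reader would normally rely on an established cheminformatics toolkit instead, since real files carry charges, isotopes, stereo flags and property lines that this sketch skips.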

The molfile is sufficiently common that most, if not all, cheminformatics software systems/applications are able to read the format, though not always to the same degree. It is also supported by some computational software such as Mathematica.

There are different versions. The current de facto standard is the V2000 molfile, though more recently the V3000 format has been circulating widely enough to be an issue for those unable to read V3000-format files.
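Software that reads only V2000 can at least detect a V3000 file before attempting to parse it. The following is a minimal sketch, assuming the version tag appears on the counts line (the fourth line of the molfile); the function name molfile_version is hypothetical, and older files that omit the tag are reported as "unknown".

```python
def molfile_version(molblock: str) -> str:
    """Return "V2000", "V3000" or "unknown" based on the counts line."""
    lines = molblock.splitlines()
    if len(lines) < 4:
        raise ValueError("too short to contain a counts line")
    counts = lines[3]  # assumed position of the counts line
    for tag in ("V2000", "V3000"):
        if tag in counts:
            return tag
    return "unknown"
```

A reader could call this first and reject or redirect V3000 input with a clear message instead of failing partway through the atom block.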

MDL publishes a specification of their Connection Table formats, which include Molfile and SD formats.

2. MDL SD file format

SDF is one of a family of file formats from MDL holding chemical data, especially structure information. “SDF” stands for structure-data file, and SDF files wrap the molfile format. Multiple compounds are separated by a delimiter line consisting of four dollar signs ($$$$). A feature of SDF is that associated data items can be stored alongside each structure.
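Splitting an SD file into per-compound records follows directly from the delimiter rule. This is a minimal sketch; the function name split_sdf_records is hypothetical, and it assumes the whole file fits in memory as a single string.

```python
from typing import List


def split_sdf_records(text: str) -> List[str]:
    """Split SD file text into records on lines consisting of $$$$."""
    records = []
    current = []
    for line in text.splitlines():
        if line.strip() == "$$$$":
            records.append("\n".join(current))
            current = []
        else:
            current.append(line)
    # Keep trailing content if the final delimiter is missing.
    if any(line.strip() for line in current):
        records.append("\n".join(current))
    return records
```

Each returned record holds one molfile block followed by that compound's data items, without the delimiter line.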

Associated data items are denoted as follows:

>  <Unique_ID>
XCA3464366

>  <ClogP>
5.825

>  <Vendor>
Sigma

>  <Molecular Weight>
499.611

Some SDF import programs (e.g. ISIS/Base) require that the first data field after the molecule data (in the example above, Unique_ID) be a unique identifier for each record.
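Reading the data items and checking the uniqueness requirement can be sketched as follows. The sketch assumes that each item starts with a header line beginning with ">" that carries the field name in angle brackets, that the value follows on the next lines, and that a blank line ends the item. The names parse_data_items and first_field_is_unique are hypothetical.

```python
import re
from typing import Dict, Iterable

# Header lines look like ">  <FieldName>"; capture the name in brackets.
_HEADER = re.compile(r"^>.*<([^>]+)>")


def parse_data_items(record: str) -> Dict[str, str]:
    """Return the data items of one SD record, in file order."""
    items: Dict[str, str] = {}
    name = None
    value_lines = []
    in_data = False
    for line in record.splitlines():
        if line.startswith("M  END"):
            in_data = True  # data items follow the molfile block
            continue
        if not in_data:
            continue
        match = _HEADER.match(line)
        if match:
            if name is not None:
                items[name] = "\n".join(value_lines)
            name, value_lines = match.group(1), []
        elif name is not None:
            if line.strip() == "":
                items[name] = "\n".join(value_lines)  # blank line ends item
                name = None
            else:
                value_lines.append(line)
    if name is not None:
        items[name] = "\n".join(value_lines)
    return items


def first_field_is_unique(records: Iterable[str]) -> bool:
    """Check that the first data field of every record has a distinct value."""
    seen = set()
    for record in records:
        items = parse_data_items(record)
        if not items:
            return False
        first_value = next(iter(items.values()))
        if first_value in seen:
            return False
        seen.add(first_value)
    return True
```

Because Python dictionaries preserve insertion order, the first key returned is the first data field in the record, which is the one such import programs would treat as the identifier.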

A data item may span multiple lines. The MDL SDF format specification requires a hard carriage return to be inserted in any text field exceeding 200 characters in length. This is frequently violated in practice, as many SMILES and InChI strings exceed this limit.
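A simple check can flag records that break this limit. The sketch below interprets the rule as "no physical data line longer than 200 characters", which is an assumption about how the limit is counted; the names MAX_DATA_LINE and overlong_data_lines are hypothetical.

```python
from typing import List, Tuple

MAX_DATA_LINE = 200  # limit quoted in the text above


def overlong_data_lines(record: str,
                        limit: int = MAX_DATA_LINE) -> List[Tuple[int, int]]:
    """Return (line_number, length) for each data line longer than limit."""
    hits = []
    in_data = False
    for number, line in enumerate(record.splitlines(), start=1):
        if line.startswith("M  END"):
            in_data = True  # only lines after the molfile block are checked
            continue
        if in_data and len(line) > limit:
            hits.append((number, len(line)))
    return hits
```

An empty result means the record satisfies the limit under this interpretation; a non-empty one lists the lines a strict importer might reject.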

 

[Wikipedia, http://en.wikipedia.org/wiki/Chemical_table_file]