A 1434-base-pair human liver cDNA coding for the entire alpha 1-antitrypsin protein has been isolated and sequenced. Translation of the coding region into amino acids reveals a precursor molecule that contains a 24-amino-acid signal peptide and the 394 amino acids present in the mature polypeptide chain. The human gene for the S variant of alpha 1-antitrypsin has also been subcloned and sequenced. The gene is composed of 10226 nucleotide bases and contains approximately equimolar amounts of all four nucleotides. The gene contains four intervening sequences (introns) and 5' and 3' noncoding regions that are 54 and 79 nucleotides in length, respectively. A 5.3-kilobase intron lies in the 5' noncoding region and contains a 143-amino-acid open reading frame, an Alu family sequence, and a pseudo transcription initiation region. No significant differences in base composition are seen between the introns and the regions that correspond to the coding regions of the mRNA (exons). A sequence of 1951 nucleotides flanking the 5' end of the gene has also been determined and contains a "TATA" box sequence (TTAAA-TA) 21 nucleotides upstream from the proposed transcription start site. Comparison of the gene sequence with the cDNA sequence reveals a single base substitution (A→T), which results in a Glu→Val substitution at position 264 in the S variant protein. The position and size of the introns, the overall base composition, and the codon preference of the alpha 1-antitrypsin gene differ from those of the chicken ovalbumin gene, even though the two proteins belong to a common protein family, as judged by amino acid sequence homology.
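
The link between the single A→T base substitution and the Glu→Val replacement can be checked against the standard genetic code. The sketch below is illustrative only: the abstract does not state which glutamate codon is present at position 264, so both glutamate codons are enumerated, and the function and constant names are hypothetical. It also restates the precursor length implied by the 24-residue signal peptide and the 394-residue mature chain.

```python
# Standard genetic code (DNA alphabet): Glu = GAA, GAG; Val = GTT, GTC, GTA, GTG.
# Which Glu codon occurs at position 264 is NOT stated in the abstract,
# so both are examined here (assumption made for illustration).
GLU_CODONS = ("GAA", "GAG")
VAL_CODONS = {"GTT", "GTC", "GTA", "GTG"}

# Precursor length implied by the abstract: signal peptide + mature chain.
PRECURSOR_LENGTH = 24 + 394


def a_to_t_variants(codon):
    """Return every codon obtained by changing exactly one A to T."""
    return [
        codon[:i] + "T" + codon[i + 1:]
        for i, base in enumerate(codon)
        if base == "A"
    ]


def glu_to_val_changes():
    """List (Glu codon, Val codon) pairs linked by a single A-to-T change."""
    hits = []
    for codon in GLU_CODONS:
        for variant in a_to_t_variants(codon):
            if variant in VAL_CODONS:
                hits.append((codon, variant))
    return hits


if __name__ == "__main__":
    for glu, val in glu_to_val_changes():
        print(f"{glu} (Glu) -> {val} (Val)")
    print(f"Precursor length: {PRECURSOR_LENGTH} amino acids")
```

For either glutamate codon, only an A→T change at the second codon position yields a valine codon, which is consistent with a single base substitution producing the Glu→Val replacement.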