Understanding the regulatory mechanisms that establish and maintain HOXA-9 gene expression requires structural information about the gene. We therefore sequenced a 7.2-kb region of the human HOXA-9 gene and mapped the positions of two partial cDNAs, each consisting of one of two 5' exons, AB (358 bp) or CD (568 bp), and a common 3' exon (exon II), from which they are separated by 5.4- and 1.0-kb introns, respectively. When the amino acid sequences were compared with those of other Hox genes belonging to the same paralogous group, exon CD exhibited the strongest homology: 73% of 91 aa residues exactly matched those of chicken Hoxa-9. An intermediate exon (90 bp) was detected within exon CD. It was flanked by a splice acceptor and a splice donor at its 5' and 3' ends, respectively, and one branchpoint site was found near the splice-acceptor site. Nucleotide sequence analysis of this region revealed, just upstream of exon CD, two TATA boxes, one CAAT box, one GC box, and one binding site each for engrailed, eve-stripe2-hb3, and Krox20. A CpG island and two RARE repeats were detected within intron I. Northern blot analysis showed that at least four main transcripts were generated from this region: all fetal tissues tested (brain, lung, liver, and kidney) produced a 1.8-kb homeobox-containing transcript (HA-9A); 2.2- and 3.3-kb transcripts were generated from exon CD and exon II (HA-9B), especially in fetal and adult kidneys as well as in adult skeletal muscle; and a 1.0-kb transcript, found in all adult and fetal tissues, was probably generated from the intermediate exon. Several weak bands lacking tissue specificity were probably contributed by hybrid transcripts between HOXA-9 and other HOXA gene(s). Together, these results may account for the unique degree of conservation of the HOX cluster in general.