Clostridium thermosulfurogenes EM1 produced a thermostable (up to 70 °C) β-galactosidase (βGal) with a pH optimum of 7 during growth on lactose. The gene (lacZ) encoding this enzyme was cloned and expressed in Escherichia coli using pUC18 as a vector. The nucleotide sequence of a 2.7-kb PstI fragment carrying the lacZ gene was determined. The open reading frame for lacZ, which encoded a protein of 716 amino acids (aa) with a calculated Mr of 83,728, was confirmed by the identity of its deduced aa sequence with the chemically determined N-terminal aa sequence of the purified βGal of C. thermosulfurogenes EM1. The structural gene was preceded by a possible promoter sequence, 5'-TTGTAG (-35) and 5'-TAATAT (-10), and by a ribosome-binding site, 5'-AGGAGG. The cloned βGal was found to be indistinguishable from the native enzyme. The Mr of the active βGal was 170,000, as determined by Superose 12HR gel filtration and gradient gel electrophoresis, indicating that the enzyme is composed of two identical subunits. Comparison of the aa sequences of different βGal enzymes revealed that five large regions of similarity with the enzymes from E. coli (lacZ, ebgA), Klebsiella pneumoniae (lacZ), and Lactobacillus bulgaricus are present in the βGal of C. thermosulfurogenes EM1 and that the putative active site residues (Glu461 and Tyr503 in the E. coli lacZ-encoded βGal) are conserved (Glu389 and Tyr429). Therefore, the thermostable βGal of C. thermosulfurogenes EM1 is more closely related to the enzyme of E. coli than to the likewise thermostable one of Bacillus stearothermophilus. (ABSTRACT TRUNCATED AT 250 WORDS)
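
The two-subunit conclusion rests on simple arithmetic: the Mr of the active enzyme (170,000) is divided by the Mr calculated for one polypeptide (83,728). The following is a minimal sketch of that check, not part of the original work. The function name estimate_subunit_count and the rounding to the nearest whole number are illustrative assumptions; only the two Mr values come from the abstract.

```python
# Values reported in the abstract.
SUBUNIT_MR = 83_728   # calculated from the deduced 716-aa sequence
NATIVE_MR = 170_000   # active enzyme, by gel filtration and gradient gel electrophoresis


def estimate_subunit_count(native_mr: float, subunit_mr: float) -> tuple[float, int]:
    """Return the native/subunit mass ratio and the nearest whole number of subunits.

    Hypothetical helper for illustration; rounding to the nearest integer is an
    assumption that only makes sense when the ratio lies close to a whole number.
    """
    ratio = native_mr / subunit_mr
    return ratio, round(ratio)


ratio, n_subunits = estimate_subunit_count(NATIVE_MR, SUBUNIT_MR)
print(f"ratio = {ratio:.2f}, nearest subunit count = {n_subunits}")
```

The ratio comes out close to 2, which is consistent with a homodimer. A size estimate from gel filtration carries experimental error, so the agreement is approximate rather than exact.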